Episode 40: DeepSeek facts vs hype, model distillation, and open source competition

Mixture of Experts

CHAPTER

DeepSeek V3 Training Costs Explained

This chapter examines what the reported $5.6 million training cost for the DeepSeek V3 model actually implies, clarifying that the figure is often cited out of context. It highlights the efficiency advances of the DeepSeek R1 model and the shift toward effective post-training strategies built on reinforcement learning.
