
Episode 40: DeepSeek facts vs hype, model distillation, and open source competition
Mixture of Experts
00:00
DeepSeek V3 Training Costs Explained
This chapter examines what the reported $5.6 million training cost for the DeepSeek V3 model actually implies, clarifying that the figure is often misunderstood when quoted out of context. It also emphasizes the efficiency advances shown by the DeepSeek R1 model and the shift toward effective post-training strategies built on reinforcement learning.