Episode 40: DeepSeek facts vs hype, model distillation, and open source competition

Mixture of Experts

00:00

DeepSeek V3 Training Costs Explained

This chapter examines what the reported $5.6 million training cost for the DeepSeek V3 model actually implies, clarifying that the figure is often quoted out of context. It emphasizes the efficiency advances of the DeepSeek R1 model and the shift toward effective post-training strategies based on reinforcement learning.

