

Decoding DeepSeek
Feb 6, 2025
In this insightful discussion, Prasanna Pendse, Global Director of AI Strategy, and Shayan Mohanty, Head of AI Research, share their expertise on the AI start-up DeepSeek. They dive into how DeepSeek’s R1 reasoning model differentiates itself from those of giants like OpenAI. The duo tackles misconceptions about AI training costs, the impact of hardware limitations, and innovative strategies to optimize performance. They also explore the implications of these developments for the tech industry’s economic landscape and the complexities surrounding model licensing.
DeepSeek's Rise
- DeepSeek is a Chinese startup with a model rivaling OpenAI's in reasoning tasks.
- A widely reported $5.6 million training cost, misconstrued as the total, fueled its popularity and app store success.
DeepSeek's Model and Optimization
- DeepSeek's model is built upon Meta's Llama and Alibaba's Qwen, using post-training techniques.
- It excels at reasoning tasks and is optimized for NVIDIA's H800 chips, a constraint imposed by US export controls.
Debunking the $5.6 Million Myth
- DeepSeek's $5.6 million figure represents only the cost of the final training run, not the total cost of developing R1.
- It excludes prior iterations, prerequisite models, and reinforcement learning costs.