Thoughtworks Technology Podcast

Decoding DeepSeek

Feb 6, 2025
In this insightful discussion, Prasanna Pendse, Global Director of AI Strategy, and Shayan Mohanty, Head of AI Research, share their expertise on the revolutionary AI start-up DeepSeek. They dive into how DeepSeek's R1 reasoning model differentiates itself from offerings by giants like OpenAI. The duo tackles misconceptions about AI training costs, the impact of hardware limitations, and innovative strategies to optimize performance. They also explore the implications of these developments for the tech industry's economic landscape and the complexities surrounding model licensing.
INSIGHT

DeepSeek's Rise

  • DeepSeek is a Chinese startup whose model rivals OpenAI's on reasoning tasks.
  • A widely reported training cost, misconstrued as $5.6 million in total, fueled its popularity and app-store success.
INSIGHT

DeepSeek's Model and Optimization

  • DeepSeek's distilled models are built upon Meta's Llama and Alibaba's Qwen using post-training techniques.
  • Its models excel at reasoning tasks and are optimized for Nvidia's H800 chips, which US export controls oblige DeepSeek to use.
INSIGHT

Debunking the $5.6 Million Myth

  • DeepSeek's $5.6 million figure covers only the final training run's cost, not the total cost of developing R1.
  • It excludes prior iterations, prerequisite models, and reinforcement-learning costs (see the back-of-the-envelope sketch below).
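For context on where the headline number comes from: it originates in DeepSeek's V3 technical report, which multiplies the GPU-hours of the final training run by an assumed rental rate. The sketch below reproduces that published arithmetic; the GPU-hour breakdown and the $2/GPU-hour rate are the report's stated assumptions, not figures from this episode.

# Back-of-the-envelope reproduction of the published "$5.6M" figure.
# All inputs are assumptions stated in DeepSeek's V3 technical report:
# H800 GPU-hours for the final training run, rented at $2 per GPU-hour.

H800_RATE_USD_PER_GPU_HOUR = 2.00  # report's assumed rental rate

gpu_hours = {
    "pre-training": 2_664_000,     # bulk of the final run
    "context extension": 119_000,  # long-context training stage
    "post-training": 5_000,        # SFT / RL fine-tuning stage
}

total_hours = sum(gpu_hours.values())  # 2,788,000 GPU-hours
total_cost = total_hours * H800_RATE_USD_PER_GPU_HOUR

print(f"Total GPU-hours: {total_hours:,}")     # 2,788,000
print(f"Estimated cost:  ${total_cost:,.0f}")  # $5,576,000 -> the "$5.6M"

Research experiments, failed runs, the reinforcement learning that produced R1, salaries, and hardware ownership all sit outside this accounting, which is the gap the episode highlights.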