
Mixture of Experts

Episode 40: DeepSeek facts vs hype, model distillation, and open source competition

Jan 31, 2025
In this engaging discussion, Kate Soule, Director of Technical Product Management at Granite, Chris Hay, Distinguished Engineer and CTO of Customer Transformation, and Aaron Baughman, IBM Fellow and Master Inventor, dive into the realities behind DeepSeek R1. They separate the facts from the hype and discuss what model distillation really means for AI competition. The trio also explores the evolving landscape of open-source AI and how recent advances in training efficiency could reshape industry strategy.
39:17

Podcast summary created with Snipd AI

Quick takeaways

  • The $5.5 million cost to train DeepSeek R1 reflects only part of model development and omits the extensive preparation and data collection that come before the final training run.
  • DeepSeek's use of model distillation shows how efficient student models can be created from robust teacher models, making capable AI cheaper to build and more accessible.
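
To make the teacher and student idea concrete, here is a minimal sketch of the classic soft-label form of distillation, in which a student is trained to match a teacher's temperature-softened output distribution. This is an illustration under stated assumptions, not a description of DeepSeek's pipeline, which may instead fine-tune students on teacher-generated text. The function names and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's to the teacher's softened distribution.

    Scaling by temperature**2 is the usual convention for keeping gradient
    magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three classes (illustrative values only).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

A student whose outputs already resemble the teacher's gets a loss near zero, while one that disagrees gets a larger loss, which is the signal a training loop would minimize.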

Deep dives

Debunking the $5.5 Million Myth

The claim that training a state-of-the-art model like DeepSeek R1 costs around $5.5 million has sparked significant debate, and the figure needs context. The number reflects a single training run of a base model. It leaves out the months of preparation, experimentation, and data collection needed to reach real-world performance, along with hardware, research, and earlier training phases, which together can far exceed the cited amount. Presenting the $5.5 million figure without these caveats is misleading and understates the true cost of AI model development.
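
For readers who want to see where a number of this size can come from, the sketch below reproduces the arithmetic. It assumes the GPU-hour counts and the $2 per GPU-hour rental rate published in the DeepSeek-V3 technical report; neither figure appears in this episode summary, and the function names are hypothetical.

```python
# Assumed inputs from the DeepSeek-V3 technical report, not from this summary.
GPU_HOURS = {
    "pre_training": 2_664_000,
    "context_extension": 119_000,
    "post_training": 5_000,
}
RENTAL_RATE_USD_PER_GPU_HOUR = 2.0  # assumed H800 rental price


def final_run_cost(gpu_hours, rate):
    """Cost of the final training run only: total GPU-hours times rental rate."""
    return sum(gpu_hours.values()) * rate


def headline_share(headline_cost, excluded_costs):
    """Fraction of total spend that the headline figure represents.

    `excluded_costs` stands for whatever the headline leaves out (hardware
    purchases, research staff, earlier experiments); callers supply their
    own estimates, since none are given in the source.
    """
    return headline_cost / (headline_cost + sum(excluded_costs))


headline = final_run_cost(GPU_HOURS, RENTAL_RATE_USD_PER_GPU_HOUR)
print(f"Final-run cost: ${headline:,.0f}")
```

The result is a rental-price estimate for one run, which is why it says little about total development cost: every excluded item passed to `headline_share` shrinks the fraction the headline covers.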
