Prasanna Pendse, Global Director of AI Strategy, and Shayan Mohanty, Head of AI Research, discuss the AI start-up DeepSeek and how its R1 reasoning model differs from the offerings of giants like OpenAI. They address misconceptions about AI training costs, the impact of hardware limitations, and the strategies DeepSeek used to optimize performance. They also explore what these developments mean for the tech industry's economic landscape and the complexities surrounding model licensing.
DeepSeek's R1 model showcases innovative optimizations that tackle hardware limitations imposed by export controls, demonstrating creative problem-solving in AI development.
The aggressive pricing strategy of DeepSeek, offering API access at significantly lower rates than competitors, raises concerns about potential market sustainability and predatory practices.
Deep dives
Understanding DeepSeek's Emergence
DeepSeek, a startup from China, launched a model that reportedly competes with high-profile AI systems, including OpenAI's offerings. The company claimed a significantly lower training cost of approximately $5.6 million for their R1 model, leading to widespread assumptions that this would allow anyone to replicate or outperform existing models for a fraction of the cost. However, misinterpretations of this claim fueled misconceptions about the feasibility of training advanced models on similar budgets. The app's rapid adoption and free access played a crucial role in its rising popularity, further intensified by the geopolitical context surrounding AI development in the region.
Technological Innovations and Constraints
DeepSeek's development of the R1 model involved sophisticated optimizations, particularly in light of hardware restrictions imposed by U.S. export controls on certain GPUs. The company worked with the restricted H800 chips and used hardware-software co-design techniques to maximize performance despite those limitations. This deep level of optimization was essential: the model's architecture, which built upon pre-existing models like Llama and Qwen, required careful integration to train successfully on the available resources. These advancements illustrate how hardware constraints can lead to creative problem-solving and significant improvements in model efficiency.
Economic Implications of Pricing Strategies
DeepSeek's aggressive pricing strategy, offering their API access at $2 per million tokens compared to OpenAI's approximately $60 per million tokens, raises questions about the long-term sustainability of such pricing. While this initiative could enhance accessibility in AI, it also suggests potential predatory pricing practices in a competitive environment. The lower prices are linked to DeepSeek's design focus on cost efficiency, but the actual operating costs and economic feasibility remain unclear without further transparency on their pricing model. As the AI landscape evolves, these competitive dynamics could significantly reshape how companies price their services and the accessibility of advanced AI technologies.
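To make the size of that pricing gap concrete, here is a minimal sketch of the arithmetic. It uses only the two per-million-token prices quoted above; the monthly token volume and the `api_cost` helper are hypothetical and chosen purely for illustration, and real bills would also depend on input/output token splits and other pricing details not covered here.

```python
# Prices per million tokens as quoted in the summary above (USD).
DEEPSEEK_PER_MILLION = 2.0
OPENAI_PER_MILLION = 60.0


def api_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in USD of processing `tokens` tokens at a flat rate."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload size, assumed only for illustration.
monthly_tokens = 500_000_000

deepseek_cost = api_cost(monthly_tokens, DEEPSEEK_PER_MILLION)
openai_cost = api_cost(monthly_tokens, OPENAI_PER_MILLION)
price_ratio = OPENAI_PER_MILLION / DEEPSEEK_PER_MILLION

print(f"DeepSeek: ${deepseek_cost:,.0f}")
print(f"OpenAI:   ${openai_cost:,.0f}")
print(f"Ratio:    {price_ratio:.0f}x")
```

At the quoted prices the gap is a factor of 30 regardless of volume, which is why the question of whether the lower price covers actual operating costs matters so much.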
The release of DeepSeek's AI models at the end of January 2025 sent shockwaves around the world. The weeks that followed have been rife with hype and rumor, ranging from suggestions that DeepSeek has completely upended the tech industry to claims that the efficiency gains ostensibly unlocked by DeepSeek are exaggerated. So, what's the reality? And what does it all really mean for the tech industry?
In this episode of the Technology Podcast, two of Thoughtworks' AI leaders — Prasanna Pendse (Global Director of AI Strategy) and Shayan Mohanty (Head of AI Research) — join hosts Prem Chandrasekaran and Ken Mugrage to provide a much-needed clear and sober perspective on DeepSeek. They dig into some of the technical details and discuss how the DeepSeek team was able to optimize the limited hardware at their disposal, and think through what the implications might be for the industry in the months to come.
Read Prasanna's take on DeepSeek on the Thoughtworks blog: https://www.thoughtworks.com/insights/blog/generative-ai/demystifying-deepseek