Sharp Tech with Ben Thompson

(Preview) 72 Hours of DeepSeek Hysteria, What DeepSeek Means for Big Tech, Lessons on the Efficacy of Chip Controls

Jan 27, 2025
DeepSeek stirs a whirlwind of reactions, spotlighting its surprisingly low claimed training costs and the questions those claims raise about transparency. The conversation shifts to the tech industry's trust dilemmas and the fallout from chip export controls, comparing the performance of H800 and H100 chips. Innovative AI modeling techniques come to the fore, showcasing advanced strategies and their implications for compute costs. Pricing tactics from major players like OpenAI and Anthropic are dissected, urging listeners to critically assess claims in the ever-evolving tech landscape.
AI Snips
INSIGHT

DeepSeek's Efficiency and Cost Controversy

  • DeepSeek's claimed $5.5M training cost sparked controversy, raising questions about the figure's veracity and its implications.
  • Ben Thompson analyzes DeepSeek's efficiency claims, argues they are plausible, and suggests much of the skepticism in the tech community amounts to coping.
INSIGHT

DeepSeek's Training Cost Explained

  • DeepSeek's claimed training cost excludes R&D and prior runs; it represents the marginal cost of the final training run.
  • Their mixture-of-experts approach, load balancing, and low-level optimizations contribute to their efficiency (see the sketch after this list).
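For context on the cost figure: DeepSeek's V3 technical report prices the final pre-training run at roughly 2.788M H800 GPU-hours, which at an assumed $2 per GPU-hour works out to about $5.58M, exactly the marginal-cost framing above. On the architecture side, below is a minimal sketch of the general top-k mixture-of-experts pattern with a load-balancing auxiliary loss. The module name, dimensions, and Switch-Transformer-style loss are illustrative assumptions, not DeepSeek's actual code (DeepSeek-V3 uses finer-grained experts and an auxiliary-loss-free balancing scheme); the point is to show why MoE saves compute: only top_k of n_experts expert networks run per token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Illustrative top-k mixture-of-experts layer (assumed names/dims).

    Each token is routed to only `top_k` of `n_experts` feed-forward
    experts, so per-token FLOPs scale with top_k, not n_experts.
    """

    def __init__(self, d_model=512, n_experts=8, top_k=2):
        super().__init__()
        self.n_experts, self.top_k = n_experts, top_k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)          # routing distribution
        weights, idx = probs.topk(self.top_k, dim=-1)      # chosen experts per token
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize gate weights

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():                          # run expert e only on its tokens
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])

        # Switch-Transformer-style load-balancing loss: penalizes routers that
        # concentrate both tokens and probability mass on a few experts.
        frac_tokens = F.one_hot(idx[:, 0], self.n_experts).float().mean(0)
        aux_loss = self.n_experts * (frac_tokens * probs.mean(0)).sum()
        return out, aux_loss
```

Usage: `out, aux_loss = MoELayer()(torch.randn(16, 512))`, with a scaled `aux_loss` added to the training objective so the router learns to spread tokens evenly.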
INSIGHT

DeepSeek's Inference Efficiency

  • DeepSeek's efficient compression of the attention key-value (KV) cache allows for greater memory efficiency during inference (a sketch of the idea follows below).
  • This addresses the memory bottleneck that typically constrains AI inference, with particular benefits for memory-limited devices such as phones.
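A minimal sketch of the idea behind KV-cache compression: instead of caching full per-head keys and values for every past token, cache one small latent vector per token and up-project it into keys and values when attention runs. This is in the spirit of DeepSeek's multi-head latent attention (MLA), but everything here (the class name, dimensions, and the omission of rotary-embedding handling) is a simplifying assumption, not the actual implementation.

```python
import torch
import torch.nn as nn

class LatentKVCache(nn.Module):
    """Illustrative low-rank KV-cache compression (assumed names/dims).

    Per token, the cache stores d_latent floats instead of the
    2 * d_model floats a conventional KV cache would hold.
    """

    def __init__(self, d_model=4096, n_heads=32, d_latent=512):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.down = nn.Linear(d_model, d_latent, bias=False)  # joint KV compression
        self.up_k = nn.Linear(d_latent, d_model, bias=False)  # latent -> keys
        self.up_v = nn.Linear(d_latent, d_model, bias=False)  # latent -> values
        self.latents = []                                     # the cache: latents only

    def append(self, h):
        # h: (batch, d_model) hidden state of the newly generated token.
        self.latents.append(self.down(h))

    def keys_values(self):
        # Reconstruct per-head K and V for attention over all cached tokens.
        c = torch.stack(self.latents, dim=1)                  # (batch, seq, d_latent)
        b, s, _ = c.shape
        k = self.up_k(c).view(b, s, self.n_heads, self.d_head)
        v = self.up_v(c).view(b, s, self.n_heads, self.d_head)
        return k, v
```

With these assumed dimensions the cache shrinks 16x (512 vs. 8,192 floats per token per layer); in a real implementation the up-projections can also be folded into the attention matmuls so full K and V are never materialized at all.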