

(Preview) 72 Hours of DeepSeek Hysteria, What DeepSeek Means for Big Tech, Lessons on the Efficacy of Chip Controls
Jan 27, 2025
DeepSeek stirs a whirlwind of reactions, spotlighting its surprisingly low claimed training cost and the transparency questions it raises. The conversation shifts to the tech industry's trust dilemmas and the fallout from chip export controls, comparing the performance of H800 and H100 chips. Innovative AI modeling techniques come to the fore, showcasing advanced strategies and their implications for chip costs. Pricing tactics from major players like OpenAI and Anthropic are dissected, urging listeners to critically assess claims in the ever-evolving tech landscape.
DeepSeek's Efficiency and Cost Controversy
- DeepSeek's claimed $5.5M training cost sparked controversy, raising questions about its veracity and implications.
- Ben Thompson analyzes DeepSeek's efficiency claims, arguing they are plausible and characterizing some of the tech community's skepticism as coping; the arithmetic behind the headline number is sketched below.
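For reference, DeepSeek's V3 technical report reportedly derives the headline figure from roughly 2.788M H800 GPU-hours priced at an assumed $2 per GPU-hour: 2.788M × $2 ≈ $5.576M, covering the final pre-training run alone.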
DeepSeek's Training Cost Explained
- DeepSeek's claimed training cost excludes R&D and prior runs; it represents the final run's marginal cost.
- Their mixture-of-experts approach, load balancing, and low-level optimizations contribute to their efficiency (see the routing sketch after this list).
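To make the mixture-of-experts idea concrete, here is a minimal Python sketch of top-k expert routing with a simple linear softmax gate; the names (`topk_moe_forward`, `gate_w`, `experts`) and shapes are illustrative assumptions, not DeepSeek's actual implementation.

```python
import numpy as np

def topk_moe_forward(x, gate_w, experts, k=2):
    """Route each token to its k highest-scoring experts and mix their outputs."""
    logits = x @ gate_w                          # (n_tokens, n_experts) gate scores
    topk = np.argsort(logits, axis=-1)[:, -k:]   # top-k expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over the chosen experts only
        for w, e in zip(weights, topk[t]):
            out[t] += w * experts[e](x[t])       # only k of n experts run per token
    return out, topk

rng = np.random.default_rng(0)
d, n_experts, n_tokens = 8, 4, 16
x = rng.normal(size=(n_tokens, d))
gate_w = rng.normal(size=(d, n_experts))
# Each toy "expert" is just a random linear map over the hidden dimension.
experts = [lambda v, W=rng.normal(size=(d, d)): v @ W for _ in range(n_experts)]

out, routed = topk_moe_forward(x, gate_w, experts)
# Load-balancing schemes aim to keep this per-expert token count roughly uniform.
print(np.bincount(routed.ravel(), minlength=n_experts))
```

The efficiency win is that only k of n experts run per token, so per-token compute is a small fraction of a dense model with the same total parameter count; load balancing then keeps the expert-usage histogram roughly uniform so no single expert becomes a bottleneck.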
DeepSeek's Inference Efficiency
- DeepSeek compresses the attention key-value (KV) cache, allowing much greater memory efficiency during inference.
- This addresses the memory limits that often bound AI inference, particularly benefiting mobile devices (a toy sketch follows below).
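As a rough illustration, here is a toy Python sketch of low-rank KV-cache compression in the spirit of DeepSeek's multi-head latent attention: cache one small latent vector per token and reconstruct full keys and values on demand. All names and shapes (`LatentKVCache`, `d_latent`, `w_down`, `w_up_k`, `w_up_v`) are assumptions for illustration, not DeepSeek's code.

```python
import numpy as np

class LatentKVCache:
    """Cache a small per-token latent instead of full keys and values."""
    def __init__(self, d_model=64, d_latent=8, seed=0):
        rng = rng = np.random.default_rng(seed)
        self.w_down = rng.normal(size=(d_model, d_latent)) / np.sqrt(d_model)
        self.w_up_k = rng.normal(size=(d_latent, d_model)) / np.sqrt(d_latent)
        self.w_up_v = rng.normal(size=(d_latent, d_model)) / np.sqrt(d_latent)
        self.latents = []                     # one (d_latent,) vector per past token

    def append(self, hidden):
        """Compress a token's hidden state and store only the latent."""
        self.latents.append(hidden @ self.w_down)

    def keys_values(self):
        """Reconstruct full K and V matrices when attention needs them."""
        c = np.stack(self.latents)            # (seq_len, d_latent)
        return c @ self.w_up_k, c @ self.w_up_v

cache = LatentKVCache()
rng = np.random.default_rng(1)
for _ in range(5):                            # pretend we generated 5 tokens
    cache.append(rng.normal(size=64))
K, V = cache.keys_values()
print(K.shape, V.shape)                       # (5, 64) (5, 64)
```

With these toy numbers the cache stores 8 floats per token instead of 2 × 64 = 128 for full keys and values, a 16x reduction; the trade-off is extra up-projection compute each time attention reads the cache.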