

605: The Democrats Behind DeepSeek
Jan 29, 2025
DeepSeek is stirring both intrigue and concern, and the discussion separates what's genuinely noteworthy from what's simply overblown. The episode also highlights a staggering shift away from Oracle Java, with 88% of surveyed companies seeking alternatives. As AI innovation heats up, tensions rise between American and Chinese tech competitors amid Nvidia's massive market drop. Corporate policies come under scrutiny as well, particularly around prompt engineering and GitHub Actions, reflecting the quirky absurdities of navigating the modern tech landscape.
Market Fragility Exposed By DeepSeek
- A single open-source model release can trigger massive market reactions because investors had concentrated expectations on a few AI winners.
- The Nvidia crash shows how fragile valuations are when a cheap competitor changes perceived future profits.
Architecture Cuts Model Costs Dramatically
- DeepSeek V2 introduced Multi-head Latent Attention (MLA) to shrink the inference-time key-value cache, alongside a Mixture-of-Experts design that activates only the parameters relevant to each token.
- Together, those architectural changes make large models much cheaper and more efficient to train and run; a minimal routing sketch follows below.
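To make the "activate only relevant parameters" idea concrete, here is a minimal, hypothetical sketch of top-k expert routing in PyTorch. It is not DeepSeek's implementation: the sizes (NUM_EXPERTS, TOP_K, D_MODEL) and the simple per-expert loop are illustrative assumptions, chosen for readability over speed.

```python
# Minimal sketch of top-k Mixture-of-Experts routing (illustrative only,
# not DeepSeek's architecture). Each token is sent to its TOP_K highest-
# scoring experts, so only a fraction of the total parameters are active.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EXPERTS = 8   # hypothetical sizes for illustration
TOP_K = 2
D_MODEL = 64

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, num_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Score all experts, keep only top_k per token.
        scores = self.gate(x)                              # (tokens, num_experts)
        topk_scores, topk_idx = scores.topk(self.top_k, dim=-1)
        topk_weights = F.softmax(topk_scores, dim=-1)      # normalize over the k picks
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.top_k):
                mask = topk_idx[:, slot] == e              # tokens routed to expert e
                if mask.any():
                    w = topk_weights[mask, slot].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])       # expert runs only on its tokens
        return out

torch.manual_seed(0)
tokens = torch.randn(10, D_MODEL)
moe = TopKMoE(D_MODEL, NUM_EXPERTS, TOP_K)
print(moe(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts run per token
```

With TOP_K=2 of 8 experts, each token touches roughly a quarter of the expert parameters per layer, which is where the compute savings the snip describes come from.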
Reported $6M Training Cost Is A Game Changer
- DeepSeek reports ~2.788M H800 GPU hours and a training cost of roughly $5.6M for V3, a dramatic reduction versus prior industry estimates; the arithmetic is sketched below.
- That ~$6M figure excludes prior R&D and earlier model versions, but it still signals materially lower marginal training costs.
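As a sanity check on the headline number, the reported cost is consistent with a flat rental-rate assumption. The $2 per H800 GPU-hour rate below is the assumption cited in DeepSeek's own V3 report; actual internal costs, and the excluded R&D, could differ substantially.

```python
# Back-of-the-envelope check of the reported V3 training cost.
# The $2/GPU-hour rental rate is the assumption DeepSeek cites; it is
# a pricing convention, not a measurement of their real spend.
gpu_hours = 2_788_000        # reported H800 GPU hours for the final run
rate_per_gpu_hour = 2.00     # assumed H800 rental price, USD

cost = gpu_hours * rate_per_gpu_hour
print(f"Estimated training cost: ${cost / 1e6:.3f}M")  # -> $5.576M
```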