Last Week in AI

#227 - Jeremie is back! DeepSeek 3.2, TPUs, Nested Learning

Dec 9, 2025
Explore the latest in open-source AI with DeepSeek 3.2, now faster and cheaper to run. Amazon's new AI chips and Google's TPUs intensify competition with NVIDIA in the hardware market. Discover Anthropic's potential IPO amid major fundraising moves, and OpenAI's internal 'Code Red' response to growing market pressure. Research on multi-agent systems and nested learning reveals exciting advances in AI reasoning. And don't miss the discussion of Microsoft adjusting sales targets after missed quotas!
INSIGHT

Sparse Attention Selects Key Tokens

  • DeepSeek sparse attention uses a lightweight indexer to keep ~2,000 high-value tokens and discard the rest to save compute.
  • This approximation preserves long-context performance because the relevant information in a long context is sparse (see the sketch after this list).
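
A minimal sketch of what indexer-based token selection could look like; the `sparse_attention` function, tensor shapes, and top-k interface are illustrative assumptions, not DeepSeek's actual implementation:

```python
import torch
import torch.nn.functional as F

def sparse_attention(q, k, v, indexer_scores, top_k=2048):
    """Sketch of sparse attention gated by a lightweight indexer.

    q: (heads, d) query for the current token
    k, v: (seq, heads, d) cached keys and values
    indexer_scores: (seq,) cheap per-token relevance scores from a
        lightweight indexer (the scoring model itself is assumed here)
    """
    keep = min(top_k, k.shape[0])
    # Keep only the ~top_k highest-scoring tokens; discard the rest,
    # so attention cost scales with top_k instead of full sequence length.
    idx = indexer_scores.topk(keep).indices
    k_sel, v_sel = k[idx], v[idx]  # (keep, heads, d)
    # Standard scaled dot-product attention over the selected subset.
    scores = torch.einsum('hd,khd->hk', q, k_sel) / k_sel.shape[-1] ** 0.5
    weights = F.softmax(scores, dim=-1)
    return torch.einsum('hk,khd->hd', weights, v_sel)
```

Because only the selected subset enters the softmax, compute per query stays roughly constant as the context grows, which is the efficiency win the snip describes.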
INSIGHT

RL Scale And Stability Drive Reasoning Gains

  • DeepSeek dedicates ~10% of total training compute to RL and uses stability tricks like off-policy sequence masking (sketched after this list) and 'keep routing' to avoid training instability.
  • Specialist distillation and mixed RL reduce catastrophic forgetting across skills.
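
One way off-policy sequence masking can work is to drop whole sequences from the policy-gradient loss once their importance ratio drifts too far from the behavior policy, rather than clipping them. This is a hedged sketch; the function name, threshold, and loss form are assumptions, not DeepSeek's published recipe:

```python
import torch

def masked_policy_gradient_loss(logprobs_new, logprobs_old, advantages,
                                ratio_threshold=10.0):
    """Sketch of off-policy sequence masking for RL stability.

    logprobs_new / logprobs_old: (batch,) summed sequence log-probs under
        the current and behavior policies
    advantages: (batch,) per-sequence advantage estimates
    """
    # Importance ratio between current and behavior policies.
    ratio = (logprobs_new - logprobs_old).exp()
    # Mask out sequences that have drifted too far off-policy, in either
    # direction, instead of letting them produce high-variance gradients.
    mask = (ratio < ratio_threshold) & (ratio > 1.0 / ratio_threshold)
    masked = ratio * advantages * mask.float()
    return -masked.sum() / mask.float().sum().clamp(min=1.0)
```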
INSIGHT

New Open Option In Image Generation

  • Flux 2 from Black Forest Labs brings high-quality, cheaper image generation into the open-source ecosystem with multiple variants.
  • It competes with Nano Banana Pro on cost-performance and expands open alternatives for image work.