The MAD Podcast with Matt Turck

Can America Win the Open Source AI Race? — Olmo 3 with Ai2’s Nathan Lambert & Luca Soldaini

Nov 20, 2025
Nathan Lambert and Luca Soldaini from Ai2 dive into the Olmo 3 release, showcasing their approach to open-source AI with full transparency. They discuss the significance of releasing the model's complete training data and recipes, and the distinction between base, instruct, and thinking models. The conversation also covers the impact of Meta's retreat from open-source AI and the resulting rise of Chinese models. Nathan and Luca explore the challenges of reasoning in AI, emphasizing the need for U.S. innovation and broader engagement in shaping AI's future.
INSIGHT

Full Openness Enables Reproducibility

  • Ai2 releases Olmo 3 with full openness: model weights, training data, recipes, and intermediate checkpoints, enabling end-to-end reproducibility.
  • This lets researchers customize models, continue pre-training, and study how behavior evolves across checkpoints.
INSIGHT

Multiple Model Flavors For Different Needs

  • Olmo 3 includes base (7B, 32B), instruct, and thinking models targeting different use cases and latency budgets.
  • The thinking models spend extra inference-time compute generating long chains of thought for harder tasks.
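The practical difference for users of a thinking model is that its raw output interleaves a long reasoning trace with the final answer, which downstream code usually needs to separate. A minimal sketch, assuming the model wraps its chain of thought in `<think>...</think>` tags (a common convention among reasoning models; Olmo 3's actual output format may differ):

```python
def split_reasoning(output: str,
                    open_tag: str = "<think>",
                    close_tag: str = "</think>") -> tuple[str, str]:
    """Split a thinking model's raw output into (reasoning, answer).

    Assumes a <think>...</think> delimiter convention (an assumption,
    not Olmo 3's documented format).  If no reasoning block is found,
    the whole output is treated as the answer.
    """
    start = output.find(open_tag)
    end = output.find(close_tag)
    if start == -1 or end == -1:
        return "", output.strip()
    reasoning = output[start + len(open_tag):end].strip()
    answer = output[end + len(close_tag):].strip()
    return reasoning, answer


# Example: the reasoning trace is kept for inspection, the answer for the user.
raw = "<think>2 + 2 = 4, so the sum is 4.</think>The answer is 4."
thought, answer = split_reasoning(raw)  # → ("2 + 2 = 4, so the sum is 4.", "The answer is 4.")
```

This separation is also why thinking models cost more per query: the reasoning tokens are generated and billed like any other output tokens, even though most applications discard them.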
INSIGHT

Curated Massive Dataset With Long Documents

  • The Dolma 3 pre-training pool draws on a ~10T-token corpus, from which ~6T tokens are sampled, with high-value documents selectively repeated.
  • Ai2 also releases ~600B tokens of long PDF-derived documents (over 8k tokens each) to boost long-context capabilities.