Open Source AI Strikes Back — Inside Ai2’s OLMo 3 ‘Thinking"

The MAD Podcast with Matt Turck

chevron_right

00:00

RLVR: Reinforcement with Verifiable Rewards

Nathan outlines RLVR vs RLHF: using verifiable correctness (e.g., tests) as rewards for stable optimization.

Play episode from 01:10:51

chevron_right

Transcript

chevron_right

Transcript

Episode notes

In this special release episode, Matt sits down with Nathan Lambert and Luca Soldaini from Ai2 (the Allen Institute for AI) to break down one of the biggest open-source AI drops of the year: OLMo 3. At a moment when most labs are offering “open weights” and calling it a day, AI2 is doing the opposite — publishing the models, the data, the recipes, and every intermediate checkpoint that shows how the system was built. It’s an unusually transparent look into the inner machinery of a modern frontier-class model.

Nathan and Luca walk us through the full pipeline — from pre-training and mid-training to long-context extension, SFT, preference tuning, and RLVR. They also explain what a thinking model actually is, why reasoning models have exploded in 2025, and how distillation from DeepSeek and Qwen reasoning models works in practice. If you’ve been trying to truly understand the “RL + reasoning” era of LLMs, this is the clearest explanation you’ll hear.

We widen the lens to the global picture: why Meta’s retreat from open source created a “vacuum of influence,” how Chinese labs like Qwen, DeepSeek, Kimi, and Moonshot surged into that gap, and why so many U.S. companies are quietly building on Chinese open models today. Nathan and Luca offer a grounded, insider view of whether America can mount an effective open-source response — and what that response needs to look like.

Finally, we talk about where AI is actually heading. Not the hype, not the doom — but the messy engineering reality behind modern model training, the complexity tax that slows progress, and why the transformation between now and 2030 may be dramatic without ever delivering a single “AGI moment.” If you care about the future of open models and the global AI landscape, this is an essential conversation.

Allen Institute for AI (AI2)

Website - https://allenai.org

X/Twitter - https://x.com/allen_ai

Nathan Lambert

Blog - https://www.interconnects.ai

LinkedIn - https://www.linkedin.com/in/natolambert/

X/Twitter - https://x.com/natolambert

Luca Soldaini

Blog - https://soldaini.net

LinkedIn - https://www.linkedin.com/in/soldni/

X/Twitter - https://x.com/soldni

FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)

Blog - https://mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck

(00:00) – Cold Open

(00:39) – Welcome & today’s big announcement

(01:18) – Introducing the Olmo 3 model family

(02:07) – What “base models” really are (and why they matter)

(05:51) – Dolma 3: the data behind Olmo 3

(08:06) – Performance vs Qwen, Gemma, DeepSeek

(10:28) – What true open source means (and why it’s rare)

(12:51) – Intermediate checkpoints, transparency, and why AI2 publishes everything

(16:37) – Why Qwen is everywhere (including U.S. startups)

(18:31) – Why Chinese labs go open source (and why U.S. labs don’t)

(20:28) – Inside ATOM: the U.S. response to China’s model surge

(22:13) – The rise of “thinking models” and inference-time scaling

(35:58) – The full Olmo pipeline, explained simply

(46:52) – Pre-training: data, scale, and avoiding catastrophic spikes

(50:27) – Mid-training (tail patching) and avoiding test leakage

(52:06) – Why long-context training matters

(55:28) – SFT: building the foundation for reasoning

(1:04:53) – Preference tuning & why DPO still works

(1:10:51) – The hard part: RLVR, long reasoning chains, and infrastructure pain

(1:13:59) – Why RL is so technically brutal

(1:18:17) – Complexity tax vs AGI hype

(1:21:58) – How everyone can contribute to the future of AI

(1:27:26) – Closing thoughts

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app

Home Top podcasts Popular guests Top books