

Interconnects
Nathan Lambert
Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories. www.interconnects.ai
Episodes

Mar 5, 2025 • 14min
Where inference-time scaling pushes the market for AI companies
The discussion dives into the unsustainable costs of providing free AI models to users. It covers the GPT-4.5 launch and the implications of inference-time compute, and how profitability may shift toward advertising as serving costs approach zero. Aggregation Theory is examined, shedding light on how a few companies could dominate the AI market by aggregating user demand, potentially paving the way for a new era of successful, user-facing AI businesses.

Feb 28, 2025 • 10min
GPT-4.5: "Not a frontier model"?
The discussion kicks off with the intriguing release of GPT-4.5 and its unusual classification as not a frontier model. Experts ponder the economic implications and community expectations tied to AI scaling. They also tackle the subtle but significant improvements this model brings compared to its predecessors. As they navigate the evolving landscape, the conversation highlights how GPT-4.5 could reshape future AI developments. Listeners will find insights about the challenges in distinguishing real advancements from perceived improvements.

Feb 26, 2025 • 12min
Character training: Understanding and crafting a language model's personality
Delve into the intricate world of character training for AI language models. Discover the distinction between public evaluations and the internal assessments that drive real progress. Learn how leading labs are sculpting models like GPT-4 to enhance user interactions. Uncover the challenges of creating human-like traits in AI without sacrificing reliability. Join the conversation on the importance of crafting distinct personalities within models, an essential yet largely overlooked aspect of post-training.

Feb 24, 2025 • 10min
Claude 3.7 thonks and what's next for inference-time scaling
Anthropic unveils Claude 3.7 Sonnet, a model that extends inference-time reasoning trained with reinforcement learning. It enhances coding capabilities alongside a new command-line tool, Claude Code, and sets new marks on software development benchmarks, outshining previous iterations. While not a revolutionary leap, the steady evolution of these models signals promising trends for AI. As the cost of superhuman coding solutions decreases, developers can anticipate cumulative enhancements throughout the year.

Feb 18, 2025 • 12min
Grok 3 and an accelerating AI roadmap
The launch of Grok 3 has shaken up the AI landscape, signaling a shift towards faster model updates. With daily improvements on the horizon, the era of waiting for new releases may be over. The podcast discusses the competitive dynamics among AI developers like DeepSeek and Grok, highlighting the importance of transparency in evaluating AI's capabilities. It also addresses the real-world utility of AI, stressing user-centric progress as the industry gears up for 2025.

Feb 13, 2025 • 40min
An unexpected RL Renaissance
Reinforcement learning is experiencing a renaissance, fueled by new research and improved infrastructure. Reinforcement learning from human feedback (RLHF) has transformed language models, reshaping AI capabilities. New tools like TRL and OpenRLHF are making it easier to train innovative models, and the evolution of deep RL techniques is paving the way for scalable, adaptable AI. With a wealth of funding and open-source resources, the future of reinforcement learning promises to be both dynamic and groundbreaking.

Feb 12, 2025 • 14min
Deep Research, information vs. insight, and the nature of science
Explore the fascinating intersection of deep research and AI's role in transforming science. The discussion contrasts mere information gathering with the pursuit of genuine insights. Hear about how pioneering tools like AlphaFold are reshaping scientific practices. The podcast emphasizes the need for scientists to adapt to this AI-driven landscape. Finally, it delves into the implications of embracing AI within scientific paradigms, promoting collaboration and innovation in research strategies.

Feb 5, 2025 • 16min
Making the U.S. the home for open-source AI
Explore the evolving landscape of open-source AI and its ideological debates. The discussion highlights the challenges of building a sustainable ecosystem amid competitive pressures from major players, particularly in the U.S. and China. Discover the significance of DeepSeek, which reshapes narratives surrounding open versus closed AI models. Delve into the vision of a future where AI is more accessible, safer, and collaboratively built by a broader community, pushing back against the dominance of super-rich companies.

Jan 28, 2025 • 12min
Why reasoning models will generalize
Explore the fascinating evolution of reasoning models in AI, highlighting their potential to generalize beyond traditional domains like programming and math. Discover how chain of thought reasoning enhances performance, allowing models to manage complexity more effectively. The discussion touches on advancements in training methodologies and the future capabilities expected by 2025. The differences in reasoning between human intelligence and language models provide intriguing insights into how information is processed and stored.

Jan 22, 2025 • 1h 13min
Interviewing OLMo 2 leads: Open secrets of training language models
Luca Soldaini, the data lead for the OLMo project at AI2, joins the discussion to unveil the intricacies of training language models. He shares stories of overcoming challenges in pretraining efficiency and the quest for training stability, especially after a difficult 70B-parameter model attempt. The conversation covers the strategic decisions behind building effective language modeling teams, the trade-offs between deep and wide network architectures, and the importance of community-driven advancements in AI.


