The Delphi Podcast

Sam Lehman: What the Reinforcement Learning Renaissance Means for Decentralized AI

Apr 30, 2025
Sam Lehman, Principal at Symbolic Capital and AI researcher, dives into the resurgence of Reinforcement Learning (RL) and its impact on decentralized AI. He discusses the evolution of AI scaling and highlights the importance of open, collaborative environments for fostering diverse AI strategies. Lehman critiques the potential limitations of human preference data on model creativity and advocates for the 'World's RL Gym' concept. The conversation touches on future AI architectures, the challenges of lock-in in proprietary systems, and the exciting potential of continuous learning.
Episode notes
INSIGHT

Three Phases of AI Scaling

  • AI scaling history has three phases: pre-training, inference-time compute, and the current reinforcement learning (RL) renaissance.
  • RL-trained reasoning models think longer during inference, which dramatically improves performance.
INSIGHT

Reasoning Models Think Longer

  • Reasoning models outperform larger models by using more compute during inference to think longer.
  • This inference-time compute scaling is a new lever for boosting model performance.
ANECDOTE

DeepSeek's Novel RL Breakthrough

  • DeepSeek used a novel RL approach (GRPO) with minimal human data to create powerful reasoning models.
  • This breakthrough showed RL can elicit complex reasoning with limited human intervention.
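The core trick in GRPO (Group Relative Policy Optimization) is to drop the learned critic used by methods like PPO: for each prompt, a group of responses is sampled and scored, and each response's advantage is computed relative to the group's mean and standard deviation. A minimal sketch of that group-relative advantage, with illustrative reward values:

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards):
    """Normalize each response's reward against its sampled group,
    replacing a learned value network with simple group statistics."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # guard against uniform groups
    return [(r - mu) / sigma for r in rewards]

# e.g. four sampled answers to one prompt, scored 1.0 (correct) / 0.0 (wrong)
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
# → [1.0, -1.0, -1.0, 1.0]
```

Correct answers get positive advantages and wrong ones negative, so the policy gradient pushes the model toward the behaviors that scored well within each group.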