Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead end

Sep 26, 2025
Richard Sutton, a leading researcher in reinforcement learning and winner of the 2024 Turing Award, argues that large language models (LLMs) are a dead end. He believes LLMs can't learn on the job and calls for a new architecture that enables continual learning, the way animals learn. The discussion covers how LLMs imitate human output rather than learn from experience, and why instilling goals is vital for intelligence. Sutton questions whether LLMs genuinely predict the world at all, advocating a future in which AI learns from real-world interaction rather than from fixed datasets.
INSIGHT

Mimicry Isn't World Modeling

  • Richard Sutton argues LLMs mimic human text rather than build genuine world models that predict real-world outcomes.
  • He says intelligence requires learning from experience and pursuing goals that change the world, both of which LLMs lack.
INSIGHT

Priors Need Ground Truth

  • Sutton stresses that a prior is only useful if there is an objective ground truth to compare it against during the agent's lifetime.
  • Without goals or ground truth, LLM-style priors cannot support continual, on-the-job learning.
INSIGHT

Prediction Enables Surprise-Based Learning

  • Sutton claims LLMs do not meaningfully predict future events and therefore cannot learn from surprising outcomes.
  • He insists that predicting the world's response to one's actions is essential for adjusting behavior, as the sketch below illustrates.
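To ground the prediction-error idea, here is a minimal sketch of temporal-difference (TD) learning, the mechanism at the heart of Sutton's reinforcement-learning work, in which the gap between what the agent predicted and what actually happened (the "surprise") drives every update. The two-state weather world, the constants, and all names here are hypothetical illustrations, not anything from the episode.

```python
import random

# A minimal sketch of surprise-driven learning via temporal-difference (TD)
# prediction errors. The two-state "world" below is a hypothetical example.

ALPHA = 0.1   # learning rate: how strongly each surprise updates predictions
GAMMA = 0.9   # discount factor applied to future reward

values = {"sunny": 0.0, "rainy": 0.0}  # the agent's predictions per state


def step(state):
    """The world's (random) response to the agent: next state and reward."""
    next_state = random.choice(["sunny", "rainy"])
    reward = 1.0 if next_state == "sunny" else 0.0
    return next_state, reward


state = "sunny"
for _ in range(10_000):
    next_state, reward = step(state)
    # Surprise = what actually happened minus what was predicted.
    td_error = reward + GAMMA * values[next_state] - values[state]
    # Adjust the prediction in proportion to the surprise.
    values[state] += ALPHA * td_error
    state = next_state

print(values)  # predictions converge toward the expected discounted reward
```

The point of the sketch: no fixed dataset is involved; the agent improves its predictions only by acting, observing the world's response, and correcting itself when surprised, which is exactly the loop Sutton argues LLMs lack.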