

Richard Sutton – Father of RL thinks LLMs are a dead end
Sep 26, 2025
Richard Sutton, a leading researcher in reinforcement learning and 2024 Turing Award winner, argues that large language models (LLMs) are a dead end. He believes LLMs can't learn on the job, and he calls for a new architecture that enables continual learning, the way animals learn. The discussion covers how LLMs imitate human output instead of learning from experience, and why instilling goals is essential to intelligence. Sutton questions whether LLMs genuinely predict the world at all, advocating for AI that learns from real-world interaction rather than fixed datasets.
AI Snips
Mimicry Isn't World Modeling
- Richard Sutton argues LLMs mimic human text rather than build true world models for predicting real-world outcomes.
- He says intelligence requires learning from experience and pursuing goals that change the world, both of which LLMs lack.
Priors Need Ground Truth
- Sutton stresses that a prior is only useful if there is an objective ground truth to compare it against over the agent's lifetime.
- Without goals or ground truth, LLM-style priors cannot support continual on-the-job learning.
Prediction Enables Surprise-Based Learning
- Sutton claims LLMs do not meaningfully predict future events and therefore cannot learn from surprising outcomes.
- He insists that predicting how the world responds to one's actions is essential for adjusting behavior; see the sketch below.
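
The learning signal described in this snip is the prediction error at the heart of temporal-difference (TD) learning, the algorithm Sutton himself pioneered. The episode doesn't give code; what follows is a minimal TD(0) sketch in Python, where the toy environment, the state names, and the values of `alpha` and `gamma` are all illustrative, not from the source.

```python
# Minimal sketch (not from the episode): TD(0) prediction learning.
# The "surprise" (TD error) is the gap between what the agent predicted
# and what it actually observed; zero surprise means no update.
import random

alpha, gamma = 0.1, 0.9          # step size and discount (illustrative values)
V = {s: 0.0 for s in range(5)}   # predicted long-term value of each state

def step(state):
    """Toy chain environment: drift right; reward only on reaching state 4."""
    nxt = min(state + 1, 4) if random.random() < 0.8 else max(state - 1, 0)
    reward = 1.0 if nxt == 4 else 0.0
    return nxt, reward

for episode in range(500):
    s = 0
    while s != 4:                        # state 4 is terminal
        s_next, r = step(s)
        # Surprise: observed outcome (r + gamma * V[s_next]) vs. prediction V[s].
        td_error = r + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error         # nudge the prediction toward ground truth
        s = s_next

print({s: round(v, 2) for s, v in V.items()})
```

The update only fires when the world contradicts the agent's prediction, which is the on-the-job learning loop Sutton describes; a model trained once on a fixed corpus never receives this kind of online prediction error.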