

Richard Sutton – Father of RL thinks LLMs are a dead end
Sep 26, 2025
Richard Sutton, a leading researcher in reinforcement learning and 2024 Turing Award winner, argues that large language models (LLMs) are a dead end. He believes LLMs can't learn on the job, and he calls for a new architecture that enables continual learning, the way animals learn. The discussion covers how LLMs imitate human output instead of learning from experience, and why instilling goals is essential to intelligence. Sutton questions whether LLMs genuinely predict the world at all, advocating for AI that learns from real-world interaction rather than fixed datasets.
AI Snips
Mimicry Isn't World Modeling
- Richard Sutton argues LLMs mimic human text rather than build true world models for predicting real-world outcomes.
- He says intelligence requires learning from experience and pursuing goals that change the world, both of which LLMs lack.
Priors Need Ground Truth
- Sutton stresses that a prior is only useful if there is an objective ground truth to compare it against over the agent's lifetime.
- Without goals or ground truth, LLM-style priors cannot support continual on-the-job learning.
Prediction Enables Surprise-Based Learning
- Sutton claims LLMs do not meaningfully predict future events and therefore cannot learn from surprising outcomes.
- He insists that predicting how the world responds to one's actions is essential for adjusting behavior; see the sketch below.
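
The learning signal described in this snip is the prediction error at the heart of temporal-difference (TD) learning, the algorithm Sutton himself pioneered. The episode doesn't give code; what follows is a minimal TD(0) sketch in Python, where the toy environment, the state names, and the values of `alpha` and `gamma` are all illustrative, not from the source.

```python
# Minimal sketch (not from the episode): TD(0) prediction learning.
# The "surprise" (TD error) is the gap between what the agent predicted
# and what it actually observed; zero surprise means no update.
import random

alpha, gamma = 0.1, 0.9          # step size and discount (illustrative values)
V = {s: 0.0 for s in range(5)}   # predicted long-term value of each state

def step(state):
    """Toy chain environment: drift right; reward only on reaching state 4."""
    nxt = min(state + 1, 4) if random.random() < 0.8 else max(state - 1, 0)
    reward = 1.0 if nxt == 4 else 0.0
    return nxt, reward

for episode in range(500):
    s = 0
    while s != 4:                        # state 4 is terminal
        s_next, r = step(s)
        # Surprise: observed outcome (r + gamma * V[s_next]) vs. prediction V[s].
        td_error = r + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error         # nudge the prediction toward ground truth
        s = s_next

print({s: round(v, 2) for s, v in V.items()})
```

The update only fires when the world contradicts the agent's prediction, which is the on-the-job learning loop Sutton describes; a model trained once on a fixed corpus never receives this kind of online prediction error.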