Don't Worry About the Vase Podcast

On Dwarkesh Patel's Podcast With Richard Sutton

Sep 29, 2025
Richard Sutton, a pioneer in AI research best known for his work on reinforcement learning, dives deep into the limitations of large language models (LLMs). He argues that LLMs mimic human behavior without true understanding and emphasizes the importance of goal-directed learning. The conversation explores themes such as continual learning, imitation versus experience, and the implications of Sutton's Bitter Lesson for future AI developments. Sutton also discusses the potential risks of merging experiences between AI copies and the need for steering AI towards positive outcomes.
INSIGHT

Mimicry ≠ World Understanding

  • Richard Sutton argues LLMs only mimic human tokens and thus lack true action-driven world understanding.
  • He claims intelligence requires learning from real-world experience, not just next-token prediction.
INSIGHT

Experience Versus Logged Examples

  • Sutton contrasts continual experiential learning with LLM training on observed human behavior examples.
  • He allows that LLMs could use external memory or programs, but in their current form they lack continual learning from an agent's own experience.
ADVICE

Incorporate Continual Learning

  • Use continual learning and online feedback to build systems that adapt during interactions.
  • Don’t rely solely on static next-token objectives if you want systems that update from real-world feedback.