ACM ByteCast

Andrew Barto and Richard Sutton - Episode 80

Jan 14, 2026
In this engaging discussion, Andrew Barto and Richard Sutton, both pioneers of reinforcement learning, share insights on their groundbreaking work. They delve into the origins of reinforcement learning, discussing its ties to neuroscience and psychology. The duo reflects on their notable contributions, such as temporal-difference learning, and on the applications of those ideas in AI systems such as AlphaGo. They also explore the future of human-RL relationships, emphasizing the importance of safety and ethical considerations in AI development.
INSIGHT

Reinforcement Learning As Rediscovered Common Sense

  • Reinforcement learning formalizes the old, commonsense idea of learning from consequences into a computational framework.
  • Andrew G. Barto and Richard Sutton describe their contribution as a rediscovery: they made the idea computationally tractable and brought it to prominence in AI.
ANECDOTE

1977 Project That Sparked RL Work

  • Barto described being hired in 1977 to test the unorthodox idea that neurons act like goal-directed agents learning from consequences.
  • That project sparked interdisciplinary exploration and led to computational implementations that revived the field.
INSIGHT

Temporal-Difference Learning Mirrors Biology

  • Temporal-difference (TD) learning uses changes in predictions over time as learning signals rather than waiting for final outcomes.
  • TD prediction errors were later found to match recordings of dopamine neuron activity, linking the algorithm to biological reward signals.
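
To make the TD idea above concrete, here is a minimal sketch of tabular TD(0) value prediction. It is an illustration only and does not come from the episode: the chain environment, the function name `td0_chain`, and all parameter values are assumptions made for this example. The agent walks left to right through a short chain and is rewarded only at the end, yet each state's prediction is updated at every step from the change between successive predictions (the TD error).

```python
def td0_chain(n_states=5, episodes=500, alpha=0.1, gamma=1.0):
    """Tabular TD(0) prediction on a hypothetical deterministic chain.

    The agent starts in state 0 and always steps right. The only reward
    (1.0) arrives on the step that leaves the last state, ending the episode.
    """
    values = [0.0] * n_states  # current prediction of future reward per state
    for _ in range(episodes):
        for state in range(n_states):
            is_last = state == n_states - 1
            reward = 1.0 if is_last else 0.0
            # The terminal state after the last step has value 0 by definition.
            next_value = 0.0 if is_last else values[state + 1]
            # TD error: how much the prediction changed after one step.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
    return values


if __name__ == "__main__":
    for n in (1, 2, 500):
        print(n, [round(v, 3) for v in td0_chain(episodes=n)])
```

After one episode only the last state has learned anything; in later episodes earlier states learn from the updated predictions of their successors, so the reward information propagates backward through the chain step by step.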