
On Dwarkesh Patel's Podcast With Richard Sutton
Don't Worry About the Vase Podcast
Temporal Difference Learning and Policy Specialization
They examine TD learning, finding intermediate objectives, and how continual learning produces policies specialized to an environment.
Transcript


