
23: Peter Sterling, Part 1
Circle of Willis
Optimal Learning through Reward Prediction Error
Searching for resources and remembering where they were found are fundamental to learning. Organisms retain effective solutions through evolution; such solutions are said to be conserved. Rewarding an organism when it finds a resource by chance is described as the most effective, optimal way to learn. Sutton and Barto formalized this idea in their work on reinforcement learning, where the learning signal is the reward prediction error: the difference between the reward received and the reward expected.
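As a rough illustration of what a reward prediction error does, here is a minimal Python sketch. It assumes the simplest delta-rule form (the error is the received reward minus the current expectation, and the expectation moves a fraction of the way toward the reward), which is a simplification of the temporal-difference methods Sutton and Barto describe. The names `update_value`, `simulate`, and `learning_rate` are hypothetical and chosen only for this example.

```python
def update_value(value, reward, learning_rate=0.1):
    """Apply one delta-rule update; return (new_value, prediction_error)."""
    # Reward prediction error: what was received minus what was expected.
    prediction_error = reward - value
    # Move the expectation a fraction of the way toward the received reward.
    new_value = value + learning_rate * prediction_error
    return new_value, prediction_error


def simulate(rewards, learning_rate=0.1, initial_value=0.0):
    """Run the update over a sequence of rewards; return (final_value, errors)."""
    value = initial_value
    errors = []
    for reward in rewards:
        value, error = update_value(value, reward, learning_rate)
        errors.append(error)
    return value, errors


# Illustrative run (assumed data): the same reward of 1.0 arrives 50 times.
final_value, errors = simulate([1.0] * 50, learning_rate=0.2)
print(f"first error: {errors[0]:.3f}, last error: {errors[-1]:.6f}")
print(f"learned expectation: {final_value:.4f}")
```

In this sketch the first reward is fully unexpected, so the error is large. As the expectation approaches the actual reward, the error shrinks toward zero, which is why a fully predicted reward teaches nothing new.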
Play episode from 44:26
Transcript


