
ML 021: Grokking Deep Reinforcement Learning with Miguel Morales
Adventures in Machine Learning
What Really Happens in Reinforcement Learning?
The value function is really what the agents are trying to calculate in their brains, right? So you start propagating that value through all of your states and realizing, wait a minute, when I go low in altitude, that is actually pretty bad. That is a little bit dangerous. You should not be doing this thing. It's basically that the rewards start propagating through the states. And then the agent basically starts making sense of, well, from this state, I have a chance to get this reward. What is really the value of this state? The other hand will be like, well, wait a minute. Let me try something different. We're talking about one signal kind of
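
The idea of rewards propagating through the states can be sketched with a small toy problem. Everything in this sketch is an assumption made for illustration and is not from the episode: a hypothetical chain of six altitude levels, a crash reward of -1 at level 0, a reward of +1 for reaching the top level, a discount factor of 0.9, and an agent that moves up or down at random. The function `evaluate_random_policy` is a made-up name. It runs plain iterative policy evaluation, which is only one of several ways to compute a value function.

```python
# Toy illustration (hypothetical setup, not from the episode):
# altitude levels 0..5, where level 0 is a crash and level 5 is a safe goal.
GAMMA = 0.9          # assumed discount factor
N_LEVELS = 6         # assumed number of altitude levels
CRASH, GOAL = 0, N_LEVELS - 1


def reward_for(next_level):
    """Reward received on entering next_level (assumed values)."""
    if next_level == CRASH:
        return -1.0
    if next_level == GOAL:
        return 1.0
    return 0.0


def evaluate_random_policy(sweeps):
    """Iterative policy evaluation for an agent that moves up or down
    with equal probability. Returns one value estimate per level."""
    values = [0.0] * N_LEVELS  # terminal levels keep a value of 0
    for _ in range(sweeps):
        new_values = values[:]
        for level in range(1, N_LEVELS - 1):  # skip the terminal levels
            total = 0.0
            for next_level in (level - 1, level + 1):
                # Bellman backup: immediate reward plus discounted next value
                total += 0.5 * (reward_for(next_level) + GAMMA * values[next_level])
            new_values[level] = total
        values = new_values
    return values


if __name__ == "__main__":
    for sweeps in (1, 2, 50):
        rounded = [round(v, 3) for v in evaluate_random_policy(sweeps)]
        print(f"after {sweeps:2d} sweeps: {rounded}")
```

After one sweep only the levels next to the crash and the goal have nonzero values. Each further sweep pushes that information one level deeper, so the low levels end up with negative values and the high levels with positive ones, which is the "low altitude is dangerous" signal described above.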