
Goodhart's Law in Reinforcement Learning
Data Skeptic
A Simple Causal Diagram of a Dog's Behavior
I was hoping we could break the problem down a little bit. I know you're using an MDP, a Markov decision process, which is defined by a very convenient tuple. Would you mind telling us a little bit about the states and actions and such? So, the states for the model: there's a pressure state, which is the predictor for weather. And then there's a barometer state, so the barometer state is only dependent on pressure.
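The structure described here (pressure predicts weather, and the barometer depends only on pressure) can be sketched in a few lines of Python. This is only an illustrative sketch, not the model discussed in the episode: the function names, the pressure range, and the 1005 hPa rain threshold are all assumptions made up for the example. It shows that forcing the barometer reading leaves the weather untouched, because the weather is computed from pressure alone.

```python
import random

# Hypothetical toy model: every name and number below is an assumption.

def sample_pressure(rng):
    # Assumed pressure range in hPa, chosen only for illustration.
    return rng.uniform(980.0, 1040.0)

def weather_from_pressure(pressure):
    # Assumed rule: low pressure means rain. Weather depends on pressure only.
    return "rain" if pressure < 1005.0 else "clear"

def barometer_from_pressure(pressure):
    # The barometer state depends only on pressure.
    return round(pressure)

def step(rng, forced_barometer=None):
    # One sample from the toy model. Passing forced_barometer overrides the
    # barometer reading without touching pressure or weather.
    pressure = sample_pressure(rng)
    weather = weather_from_pressure(pressure)
    if forced_barometer is None:
        barometer = barometer_from_pressure(pressure)
    else:
        barometer = forced_barometer
    return {"pressure": pressure, "barometer": barometer, "weather": weather}

# Two generators with the same seed produce the same pressure sequence.
rng_observed = random.Random(0)
rng_tampered = random.Random(0)

observed = [step(rng_observed) for _ in range(5)]
tampered = [step(rng_tampered, forced_barometer=1040) for _ in range(5)]

for o, t in zip(observed, tampered):
    print(o["barometer"], o["weather"], "|", t["barometer"], t["weather"])
```

In the tampered runs the barometer always reads high, yet the weather column matches the observed runs exactly, which is the gap between moving a measurement and moving the thing it measures.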