Goodhart's Law in Reinforcement Learning

Data Skeptic

A Simple Causal Diagram of a Dog's Behavior

I was hoping we could break down the problem a little bit. I know you're using an MDP, a Markov decision process, which is defined by a very convenient tuple. Would you mind telling us a little bit about the states, actions, and such?

So, with states for the model, there's a pressure state, where pressure is the predictor for weather. And then there's a barometer state, and the barometer state is only dependent on pressure.
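The structure described here (pressure predicts weather, and the barometer depends only on pressure) can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the model discussed in the episode: the pressure range, the rain threshold, and the names `sample_world`, `rain_rate`, and `barometer_override` are all hypothetical.

```python
import random


def sample_world(rng, barometer_override=None):
    """Draw one sample from a toy pressure -> {weather, barometer} model."""
    # Assumption: pressure in hPa, drawn uniformly from an illustrative range.
    pressure = rng.uniform(980.0, 1040.0)
    # Assumption: weather is determined by pressure alone (low pressure -> rain).
    weather = "rain" if pressure < 1005.0 else "clear"
    # The barometer reads the pressure unless its reading is forced by hand.
    barometer = pressure if barometer_override is None else barometer_override
    return {"pressure": pressure, "weather": weather, "barometer": barometer}


def rain_rate(n, seed, barometer_override=None):
    """Fraction of n sampled worlds in which it rains."""
    rng = random.Random(seed)
    samples = [sample_world(rng, barometer_override) for _ in range(n)]
    return sum(s["weather"] == "rain" for s in samples) / n


# Same seed, so both runs see the same sequence of pressures.
observed = rain_rate(10_000, seed=0)
forced = rain_rate(10_000, seed=0, barometer_override=1040.0)
print(observed == forced)
```

Because weather never reads the barometer in this sketch, forcing the barometer to a high reading leaves the rain rate unchanged: the reading is useful as a measurement of pressure, but pushing on it directly does nothing to the weather.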
