Reslope: An Inverse Reinforcement Learning Approach

In inverse reinforcement learning, I assume that the agent executing the behavior is near-optimal for some reward function, and then I try to reverse engineer what that reward function was. In Reslope, the data we do this with is a reward that you only get at the end of the episode. If you take a lot of standard reinforcement learning algorithms and force them to observe reward only at the end, rather than incrementally as they go along, the problem becomes much harder.
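To make the "reward only at the end" setting concrete, here is a minimal Python sketch. It is not Reslope itself, only an illustration of the observation model described above. The function name `to_terminal_only` and the sample reward values are hypothetical, chosen for the example.

```python
from typing import List


def to_terminal_only(step_rewards: List[float]) -> List[float]:
    """Hide incremental rewards from the learner.

    Returns a sequence of the same length in which every step shows 0.0
    except the last, which carries the episode's total reward.
    (Hypothetical helper for illustration; not part of any library.)
    """
    if not step_rewards:
        return []
    total = float(sum(step_rewards))
    return [0.0] * (len(step_rewards) - 1) + [total]


# Made-up per-step rewards for one episode.
dense = [1.0, 0.0, -1.0, 2.0]
sparse = to_terminal_only(dense)

print(dense)   # what a standard RL algorithm would normally observe
print(sparse)  # what it observes when reward arrives only at the end
```

The total reward is the same in both views, but in the second the learner gets no signal about which step earned or lost it. Recovering that per-step credit from a single end-of-episode number is the part the speaker frames as reverse engineering the reward function.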

