Keys and Chests: A Natural Experiment

There's nothing in the training setup that tells you whether the wall or the coin is the goal. So in a way it's totally unfair to the agent to say you're misgeneralizing. But I think that reflects the real world, where in general things are underspecified. And we do a number of other experiments in that paper as well, which show that this can happen, to a much lesser extent, even if you do randomize the coin location somewhat, so the correlation doesn't have to be perfect.
