Hear This Idea cover image

#66 – Michael Cohen on Input Tampering in Advanced RL Agents

Hear This Idea

00:00

The Distal Model and the Proximal Model of the World

If the agent never interrupts this information channel, I think you should say they're both equally correct. But in a world where that's on the table, if it does experiment with these, then it would end up favoring one over the other. The model that says I get reward when I win a chess strikes me as way, way simpler than the model that saying I'm on planet earth. This could succeed or fail for different agents differently,. Depending on the way that we're giving it reward and the environment that it's in.

Play episode from 41:04
Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app