Joel Lehman: Open-Endedness and Evolution through Large Models

The Gradient: Perspectives on AI

Is There a Reward Shift in a Novelty Search?

The claim here is that you're letting go, as much as possible, of any sense of a specific objective, and instead you're saying, hey, do as many different things as possible. Is that kind of the high-level idea there?

Yeah. You were talking about reward misspecification and distribution shift, which gets into AI safety issues that are definitely related. But I would say deception could exist without either of those things, in the sense that whatever your goal behavior, your desired behavior, is, it's just very hard to specify the recognition of that state.
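For context on the idea described here, below is a minimal Python sketch of novelty search as Lehman characterizes it: selection pressure comes from a novelty score (distance to behaviors already discovered) rather than from any task objective. The function and parameter names (sample_behavior, add_threshold, k) are illustrative assumptions, not code from the episode or from Lehman's work.

```python
import numpy as np

def novelty_score(behavior, archive, k=15):
    """Novelty = mean distance to the k nearest behaviors seen so far.

    There is no task objective here: a behavior scores highly purely
    for being different from what the search has already produced.
    """
    if not archive:
        return float("inf")  # the first behavior is maximally novel
    dists = sorted(np.linalg.norm(np.array(b) - np.array(behavior)) for b in archive)
    return float(np.mean(dists[:k]))

def novelty_search(sample_behavior, generations=100, pop_size=50, add_threshold=1.0):
    """Minimal novelty-search loop: keep whatever is novel, ignore any goal."""
    archive = []
    for _ in range(generations):
        population = [sample_behavior() for _ in range(pop_size)]
        scored = [(novelty_score(b, archive), b) for b in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Archive the most novel behaviors so later novelty is measured
        # against everything discovered so far.
        for score, b in scored:
            if score > add_threshold:
                archive.append(b)
    return archive
```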
