
11 - Attainable Utility and Power with Alex Turner

AXRP - the AI X-risk Research Podcast


Getting the Space of Reward Functions Right

Alex Turner: It seems like you've got to get the space of reward functions approximately right for randomly sampling them to end up being useful. The actually useful objective isn't going to be a function of the whole world state, like the temperature and the pressure and whatever other statistics; it's going to involve chunking things into objects, and featurization and such. And I think, well, it's true that you could get some rather strange auxiliary goals. I think that just using the same format that the primary reward is in should generally work pretty well.
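For concreteness, here is a minimal Python sketch of the idea being described: sample random auxiliary reward functions over the same featurized state space that the primary reward uses, rather than over raw world statistics. This is an illustration, not code from the episode; the `featurize` helper and the linear reward form are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def featurize(state):
    # Assumed stand-in: map a raw state to a small feature vector
    # (e.g., object-level features), mirroring the inputs the
    # primary reward function is defined over.
    return np.asarray(state, dtype=float)

def sample_auxiliary_reward(n_features, rng):
    # One randomly sampled linear reward over the feature space:
    # r(s) = w . phi(s). Sharing the primary reward's "format"
    # (the same feature space) is the point of the sketch.
    w = rng.uniform(-1.0, 1.0, size=n_features)
    return lambda state: float(w @ featurize(state))

# Sample a handful of auxiliary rewards in the primary reward's format.
aux_rewards = [sample_auxiliary_reward(n_features=4, rng=rng) for _ in range(8)]

state = [0.2, -0.5, 1.0, 0.3]  # toy featurized state
print([round(r(state), 3) for r in aux_rewards])
```

Because each sampled reward is a function of the same feature vector as the primary reward, the auxiliary objectives stay on the same "chunked" level of description instead of depending on arbitrary world statistics.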

