Hear This Idea cover image

#66 – Michael Cohen on Input Tampering in Advanced RL Agents

Hear This Idea

00:00

Reward Is Not the Optimization Target

There's just no way around the fact that in, for example, this video game environment and so therefore in environments in general, you sometimes need to learn on the fly what your goal is about. So the second thing I'll say about those ideas that reward is not the optimization target. The simplest argument, which I think would require pretty tremendous evidence to discard is if you're going to be maximizing reward way better than any person. Is that really going to happen without trying? Now you can have a narrowly trained system where assumption three fails, where the proximal model is considered way more complex than the distal model. And so then that helps us figure out how we can arrange for this

Play episode from 01:26:53
Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app