4 - Risks from Learned Optimization with Evan Hubinger

AXRP - the AI X-risk Research Podcast

Will It Generalize Properly to This New Datapoint?

A lot of times people read a paper and think, what is going on with this sort of deployment distribution? Sometimes we just do. Even if you're doing a deployment where you're not doing online learning, if you notice that the model's doing something bad, there are still feedback mechanisms. You can try to shut it down. But the fundamental generalization problem is not changed by the fact that, after it produces some action on this data point, we'll go back and do some additional training on that. There's still a question of how it's going to generalize to this new data point. And so, you know, an example of a situation like this,
