
4 - Risks from Learned Optimization with Evan Hubinger

AXRP - the AI X-risk Research Podcast


How Do You Get a Situation Where Your System Cares About Multiple Episodes?

How do you ever get a situation where your system cares about what happens after the gradients are applied?

Yes, that's a great question. And in fact, this is actually really close to a lot of the research that I am currently doing: trying to understand precisely the situations in which you get these sorts of processes, and under what circumstances you can produce myopic objectives. By default, we should basically expect that the model is going to try to get those other rewards as well, because it would require an additional sort of distinction in the model, in its understanding of the world, to believe that only these rewards are the ones that matter.
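To make the distinction being discussed concrete, here is a minimal, hypothetical sketch (not from the episode) contrasting a myopic objective, which only counts reward inside the current training episode, with a non-myopic one that keeps valuing reward past the episode boundary. The function names, discount factor, and toy reward stream are illustrative assumptions.

```python
import numpy as np

def myopic_return(rewards, episode_boundary):
    """Sum reward only up to the end of the current episode."""
    return float(np.sum(rewards[:episode_boundary]))

def non_myopic_return(rewards, gamma=0.99):
    """Discounted sum that runs straight through the episode boundary,
    i.e. an objective that still values what happens after this episode."""
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * rewards))

# A reward stream spanning two episodes of length 5 each.
rewards = np.array([1.0] * 10)
print(myopic_return(rewards, episode_boundary=5))  # counts episode-1 reward only
print(non_myopic_return(rewards))                  # also values episode-2 reward
```

The point of the sketch is that the second objective is the "default" shape: restricting attention to the current episode requires the extra boundary argument, mirroring the claim that caring only about the current episode requires an additional distinction in the model's understanding of the world.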
