
4 - Risks from Learned Optimization with Evan Hubinger

AXRP - the AI X-risk Research Podcast


Neural Networks

The way in which we currently do machine learning is so focused on behavioural incentives. The problem is that, if you're just looking at the behaviour on the training distribution, you can't really distinguish between a model which, like, will be objective robust and one with no internal objective. So I think it's very difficult to sort of analyze this question behaviourally because fundamentally, we're looking at a problem of unidentifiability. We're looking for situations in which, we sort of talk about the [unintelligible] that there is this, this [unintelligible].
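
As a rough illustration of the unidentifiability point, here is a minimal Python sketch. Everything in it is a hypothetical toy and not taken from the episode: `model_a` and `model_b` are stand-in functions, and `TRAIN_INPUTS` and `OFF_DISTRIBUTION_INPUTS` are made-up input sets. It only shows that two different functions can behave identically on the training inputs and still diverge elsewhere, so behaviour on the training distribution alone cannot tell them apart.

```python
# Hypothetical toy setup: none of these names come from the episode.
TRAIN_INPUTS = [0, 1, 2, 3]          # stands in for the training distribution
OFF_DISTRIBUTION_INPUTS = [10, 20]   # inputs never seen during training


def model_a(x):
    """Doubles its input everywhere."""
    return 2 * x


def model_b(x):
    """Doubles its input only inside the training range, otherwise returns 0."""
    return 2 * x if 0 <= x <= 3 else 0


def behaviour(model, inputs):
    """Record the model's outputs on a list of inputs."""
    return [model(x) for x in inputs]


# Compare the two models purely by their observed behaviour.
same_on_train = behaviour(model_a, TRAIN_INPUTS) == behaviour(model_b, TRAIN_INPUTS)
same_off_train = behaviour(model_a, OFF_DISTRIBUTION_INPUTS) == behaviour(
    model_b, OFF_DISTRIBUTION_INPUTS
)

print(same_on_train, same_off_train)  # prints: True False
```

The two toy models are indistinguishable from training behaviour, and the difference only shows up on inputs outside that set.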
