4 - Risks from Learned Optimization with Evan Hubinger

AXRP - the AI X-risk Research Podcast

Neural Networks

The way in which we currently do machine learning is so focused on behavioural incentives. The problem is that, if you're just looking at the behaviour on the training distribution, you can't really distinguish between a model which will be objective-robust and one with no internal objective. So I think it's very difficult to analyze this question behaviourally, because fundamentally we're looking at a problem of unidentifiability. We're looking for situations in which [unintelligible]
