
4 - Risks from Learned Optimization with Evan Hubinger
AXRP - the AI X-risk Research Podcast
00:00
Neural Networks
The way in which we currently do machine learning is so focused on behavioral incentives. The problem is that, if you're just looking at the behavior on the training distribution, you can't really distinguish between a model which, like, will be objective robust and one with no internal objective. So I think it's very difficult to sort of analyze this question behaviorally, because fundamentally we're looking at a problem of unidentifiability. We're looking for situations in which we sort of talk about the [unintelligible].
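The point about training-distribution behavior can be made concrete with a small sketch. Everything in it is a hypothetical illustration rather than anything from the episode: `model_a` and `model_b` are stand-in functions, and the "training distribution" is assumed to be inputs in the range 0 to 1. The two models agree on every training input, so a purely behavioral check on that data cannot tell them apart, even though they diverge elsewhere.

```python
# Toy illustration (assumed setup, not from the episode): two "models" that
# behave identically on the training distribution but differ off-distribution.

def model_a(x: float) -> float:
    """Hypothetical model that computes the intended quantity everywhere."""
    return x


def model_b(x: float) -> float:
    """Hypothetical model that matches model_a only on [0, 1] (it clips)."""
    return min(max(x, 0.0), 1.0)


def behaviorally_identical(f, g, inputs) -> bool:
    """True if f and g produce the same output on every given input."""
    return all(f(x) == g(x) for x in inputs)


# Assumed training distribution: points in [0, 1].
train_inputs = [i / 10 for i in range(11)]
# Assumed off-distribution inputs: points outside [0, 1].
off_distribution_inputs = [-1.0, 2.0, 5.0]

same_on_train = behaviorally_identical(model_a, model_b, train_inputs)
same_off_train = behaviorally_identical(model_a, model_b, off_distribution_inputs)

print("identical on training inputs:", same_on_train)
print("identical off distribution:", same_off_train)
```

Under these assumptions, no amount of checking on `train_inputs` separates the two models; the difference only shows up on inputs the training data never covered.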