
4 - Risks from Learned Optimization with Evan Hubinger
AXRP - the AI X-risk Research Podcast
Neural Networks
The way in which we currently do machine learning is so focused on behavioural incentives. The problem is that, if you're just looking at the behaviour on the training distribution, you can't really distinguish between a model which will be objective robust and one with no internal objective. So I think it's very difficult to sort of analyze this question behaviourally because, fundamentally, we're looking at a problem of unidentifiability. We're looking for situations in which [unintelligible].
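The unidentifiability point can be illustrated with a toy sketch. Everything here is a hypothetical setup invented for illustration, not something from the episode: two hand-written "models" pursue different targets, but the training inputs are constructed so that the targets always coincide, so their behaviour on that distribution is identical.

```python
# Hypothetical toy setup: each observation is (exit_position, green_position).
# On the training distribution the two positions always coincide.
train_inputs = [(p, p) for p in range(5)]
# Under a distribution shift (also hypothetical) they come apart.
shifted_inputs = [(p, (p + 1) % 5) for p in range(5)]


def seeks_exit(obs):
    """Model A: always moves to the exit."""
    exit_pos, _green_pos = obs
    return exit_pos


def seeks_green(obs):
    """Model B: always moves to the green marker."""
    _exit_pos, green_pos = obs
    return green_pos


def behaviours_match(f, g, inputs):
    """True if the two models act identically on every given input."""
    return all(f(x) == g(x) for x in inputs)


print(behaviours_match(seeks_exit, seeks_green, train_inputs))    # True
print(behaviours_match(seeks_exit, seeks_green, shifted_inputs))  # False
```

Looking only at training behaviour, nothing separates the two models; the difference shows up only once the inputs leave the training distribution.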