
4 - Risks from Learned Optimization with Evan Hubinger
AXRP - the AI X-risk Research Podcast
Proxy Alignment Failures
In practice, in many situations, we don't train to zero training error. We have really complex datasets that are very difficult to fit completely. In that situation, it isn't just an identifiability issue, and in fact, you can end up in a situation where the inductive biases can be stronger, in some sense, than the training data. If you train too far on the training data, you sort of perfectly fit it, you overfit the training data. But if you stop such that your inductive biases still have a strong influence on the actual thing that you end up with, then you're in...
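The mechanism described here, stopping training early so that the inductive bias toward simple solutions still shapes the final model rather than the noise in the training data, can be sketched with validation-based early stopping. This is a generic toy example of my own (the data, the patience value, and the one-parameter model are all illustrative assumptions, not anything from the episode):

```python
# Toy sketch of early stopping: halt gradient descent when validation loss
# stops improving, rather than driving training error all the way down.
import random

random.seed(0)

# Noisy data around y = 2x; hold out the last few points for validation.
data = [(x, 2.0 * x + random.gauss(0, 0.5)) for x in range(20)]
train, val = data[:15], data[15:]

def mse(w, points):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in points) / len(points)

w = 0.0                      # start from the "simplest" hypothesis
lr = 0.001
best_w, best_val = w, mse(w, val)
patience, bad_steps = 5, 0   # illustrative patience threshold

for step in range(1000):
    # Gradient of the training MSE with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
    w -= lr * grad
    v = mse(w, val)
    if v < best_val:
        best_w, best_val, bad_steps = w, v, 0
    else:
        bad_steps += 1
        if bad_steps >= patience:   # stop before fitting the noise
            break

print(round(best_w, 2))
```

The stopping rule means the returned parameter is the one the optimizer and its biases had reached when validation performance peaked, not the one that minimizes training error.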