The Inside View cover image

3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability

The Inside View

00:00

How to Apply More Oftinization Power to a Task?

In training processes, values that are easy to specify will win out systematically. This breaks stratg ceiling, because some som people's values that are l easy to specify wil like systematiclly. And so in one way, we might want a sort of alined training process to do is not the sort of systematically better for some of these sorts of values and other values.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app