3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability

The Inside View

Transparency Training

We need to train models in a way that doesn't just look at the model's behavior. We have to use transparency tools to solve the problem in the first place. And so I'm in favor of approaches where we directly train models using transparency tools. Whereas Chris is more in favor of a post-mortem approach, where you see why it didn't work. [unintelligible]
