21 - Interpretability for Engineers with Stephen Casper

AXRP - the AI X-risk Research Podcast

00:00

The Importance of Interpretability in Safe AI

I think there are a few levels in which interpretability can be useful. For example, you could use interpretability tools to determine legal accountability. But that's probably not going to be the kind of thing that saves us all someday. From an AI safety perspective, I think interpretability is just kind of good in general for finding bugs and guiding the fixing of these bugs.
