Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe

Future of Life Institute Podcast

CHAPTER

The Future of Interpretability

Interpretability plays a pretty big role right now. I think we don't have super great interpretability abilities. Holden Karnofsky has a blog post out on how we might align transformative AI that's built really soon, and it discusses a number of ideas. We could train models to distill what that interpretability says: if we have some slow procedure for looking at a model's weights with a bunch of humans, then we can potentially dramatically speed up interpretability by training a model to reproduce that procedure's judgments. That's it for this episode. On the next episode, I talk with Ajeya about how to think clearly in fast-moving worlds, whether the pace of change is accelerating or not.
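To make that distillation idea concrete, here is a minimal sketch, not anything from the episode itself: `slow_human_interpretability` is a made-up stand-in for an expensive human-in-the-loop procedure that labels a model's internals, and a cheap classifier (scikit-learn's `LogisticRegression`) is trained to imitate its labels so the judgment can be applied at scale.

```python
# Purely illustrative sketch of distilling a slow interpretability procedure
# into a fast proxy model. "slow_human_interpretability" is a hypothetical
# stand-in for humans slowly inspecting a model's weights or activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def slow_human_interpretability(activation: np.ndarray) -> int:
    # Stand-in for the expensive human judgment; here just a toy
    # threshold rule returning a binary label.
    return int(activation.sum() > 0)

# Gather a modest number of expensive human labels...
activations = rng.normal(size=(500, 32))  # toy stand-in for model internals
labels = np.array([slow_human_interpretability(a) for a in activations])

# ...and distill them into a fast proxy model.
fast_proxy = LogisticRegression().fit(activations, labels)

# The distilled proxy can now screen far more examples than humans could.
print(fast_proxy.predict(rng.normal(size=(5, 32))))
```

The pattern, not the particular classifier, is the point: any fast model trained on the slow procedure's outputs plays the same role.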
