E48: Mechanizing Mechanistic Interpretability with Arthur Conmy

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

00:00

Unpacking Mechanistic Interpretability in AI

This chapter explores advancements in mechanistic interpretability, highlighting the ACDC library and its applications to generative language models. It discusses the emergence of capabilities in large AI models, emphasizing the complexities of understanding these systems during training and their implications for AI safety. The conversation reflects on the potential of mechanistic interpretability to enhance our understanding of AI behaviors and risks, advocating for further research in this evolving field.

