E48: Mechanizing Mechanistic Interpretability with Arthur Conmy

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

00:00

Automating Mechanistic Interpretability through Algorithmic Approaches

This chapter explores a three-step process for mechanistic interpretability in neural networks: identifying a task, choosing an optimization goal, and running the Automatic Circuit Discovery (ACDC) algorithm. It highlights how ACDC automates the analysis of a neural network's computational graph to pinpoint the components that are responsible for the model's behavior on the chosen task.
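The pruning idea behind ACDC can be sketched on a toy graph. This is a minimal illustration under stated assumptions, not the published implementation or anything quoted from the episode: the graph, its weights, the helper names (`run`, `acdc_sketch`), and the threshold `tau` are all hypothetical, and an absolute output difference stands in for the KL-divergence metric typically used on real language models. The sketch walks the graph from the output backwards, tries replacing each edge's signal with its value from a corrupted input, and drops the edge permanently if the metric barely changes.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# Hypothetical toy linear graph: each node's value is the weighted sum of its parents.
WEIGHTS: Dict[Edge, float] = {
    ("x1", "h1"): 1.0,
    ("x2", "h1"): 0.01,
    ("x1", "h2"): 0.02,
    ("x2", "h2"): 1.0,
    ("h1", "out"): 2.0,
    ("h2", "out"): 0.01,
}
ORDER: List[str] = ["x1", "x2", "h1", "h2", "out"]  # topological order


def run(inputs: Dict[str, float],
        corrupted_cache: Dict[str, float],
        kept: Set[Edge]) -> Dict[str, float]:
    """Forward pass; a removed edge carries the parent's corrupted-run value."""
    values = dict(inputs)
    for node in ORDER:
        if node in inputs:
            continue
        total = 0.0
        for (src, dst), weight in WEIGHTS.items():
            if dst != node:
                continue
            src_value = values[src] if (src, dst) in kept else corrupted_cache[src]
            total += weight * src_value
        values[node] = total
    return values


def acdc_sketch(clean: Dict[str, float],
                corrupt: Dict[str, float],
                tau: float) -> Set[Edge]:
    """Greedily prune edges whose removal raises the metric by less than tau."""
    all_edges = set(WEIGHTS)
    corrupted_cache = run(corrupt, {}, all_edges)   # activations on corrupted input
    reference = run(clean, {}, all_edges)["out"]    # full-model output on clean input

    def metric(edges: Set[Edge]) -> float:
        # Stand-in for KL divergence: distance from the full model's output.
        return abs(run(clean, corrupted_cache, edges)["out"] - reference)

    kept = set(all_edges)
    current = metric(kept)
    for node in reversed(ORDER):                    # output first, inputs last
        for edge in sorted(e for e in all_edges if e[1] == node):
            candidate = kept - {edge}
            score = metric(candidate)
            if score - current < tau:               # edge barely matters: drop it
                kept, current = candidate, score
    return kept


circuit = acdc_sketch({"x1": 1.0, "x2": 1.0}, {"x1": 0.0, "x2": 0.0}, tau=0.1)
print(sorted(circuit))  # [('h1', 'out'), ('x1', 'h1')]
```

On this toy graph the surviving circuit is the single path x1 → h1 → out, which carries almost all of the output signal. A larger `tau` prunes more aggressively at the cost of a less faithful subgraph.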
