"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis cover image

Mechanistic Interpretability: Philosophy, Practice & Progress with Goodfire's Dan Balsam & Tom McGrath

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Navigating Paradigms in Neural Networks

This chapter explores the coexistence of parameter decomposition and activation-based approaches in neural networks, shedding light on their respective roles in mechanistic interpretability. It emphasizes the challenges of the alignment problem and the importance of interpretability in understanding model behaviors. Additionally, the discussion reflects on historical techniques, such as sparse autoencoders, and their relevance to modern machine learning practices.
