

Mapping the Mind of a Neural Net: Goodfire’s Eric Ho on the Future of Interpretability
Jul 8, 2025
Eric Ho, founder of Goodfire, is at the forefront of AI interpretability, tackling the challenge of understanding neural networks. He shares breakthroughs in resolving superposition using sparse autoencoders and demonstrates innovative model editing techniques. The conversation touches on real-world applications, particularly in genomics, and the vital role of interpretability as AI grows in influence. Ho emphasizes the importance of independent research in making AI systems more transparent and mitigating risks associated with powerful technologies.
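As a rough illustration of the sparse-autoencoder idea mentioned above, the sketch below trains a wide, L1-regularized autoencoder on hidden activations so that features packed together in superposition can separate into individually interpretable directions. The layer sizes, L1 coefficient, and random stand-in "activations" are illustrative assumptions, not Goodfire's actual configuration.

```python
# Minimal sparse autoencoder (SAE) sketch: learn an overcomplete, sparse
# feature dictionary over a model's hidden activations.
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)  # project into a wider feature space
        self.decoder = nn.Linear(d_hidden, d_model)  # reconstruct the original activation

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))       # sparse, non-negative feature activations
        reconstruction = self.decoder(features)
        return reconstruction, features


def sae_loss(x, reconstruction, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that pushes each activation
    # to be explained by only a few active features.
    mse = ((reconstruction - x) ** 2).mean()
    sparsity = features.abs().mean()
    return mse + l1_coeff * sparsity


# Toy usage on random tensors; a real run would use activations captured
# from a language model's residual stream.
sae = SparseAutoencoder(d_model=512, d_hidden=4096)
optimizer = torch.optim.Adam(sae.parameters(), lr=1e-4)
activations = torch.randn(64, 512)

optimizer.zero_grad()
reconstruction, features = sae(activations)
loss = sae_loss(activations, reconstruction, features)
loss.backward()
optimizer.step()
```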
AI Snips
Necessity of Neural Net Interpretability
- Understanding neural networks is crucial as AI gains mission-critical societal roles.
- Inspecting a model's inner workings enables safer, more capable, and more reliable AI design than black-box evaluation alone.
Advantage in Mechanistic Interpretability
- Mechanistic interpretability benefits from perfect access to all neural network parameters, unlike neuroscience.
- This advantage allows deeper progress in mapping and understanding the inner workings of large language models (see the sketch below).
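To make the "perfect access" point concrete, the hedged sketch below uses an arbitrary open-weights model (GPT-2 via Hugging Face transformers) to show that every parameter can be enumerated by name and any intermediate activation captured with a forward hook. The choice of model and layer is illustrative, not from the episode.

```python
# Unlike a biological brain, a neural network exposes every weight and
# intermediate activation directly.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any open-weights model works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Every weight is available by name (printing only the first few here).
for name, param in list(model.named_parameters())[:5]:
    print(name, tuple(param.shape))

# Any intermediate activation can be captured with a forward hook.
captured = {}

def save_activation(module, inputs, output):
    captured["mlp_out"] = output.detach()

hook = model.transformer.h[5].mlp.register_forward_hook(save_activation)
inputs = tokenizer("Interpretability starts with access.", return_tensors="pt")
with torch.no_grad():
    model(**inputs)
hook.remove()

print(captured["mlp_out"].shape)  # (batch, sequence length, hidden size)
```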
Bonsai Metaphor for AI Design
- Goodfire aims to shape AI like bonsai trees by intentionally growing and pruning behaviors.
- Understanding the effect of every piece of training data enables precise, deliberate design of AI cognition.