
40 - Jason Gross on Compact Proofs and Interpretability
AXRP - the AI X-risk Research Podcast
Exploring Variations and Simplifications in Neural Network Processing
This chapter focuses on the role of sparse autoencoders and crosscoders in understanding neural networks. It highlights why identifying variations in the data that do not matter to a network's behavior helps simplify explanations of that behavior, and it discusses ongoing research into feature interactions and their effects on network symmetry.