40 - Jason Gross on Compact Proofs and Interpretability

AXRP - the AI X-risk Research Podcast

Exploring Variations and Simplifications in Neural Network Processing

This chapter focuses on the role of sparse autoencoders and crosscoders in understanding neural networks. It highlights the significance of identifying irrelevant variations in the data to simplify explanations of network behavior, and discusses ongoing research into feature interactions and their effects on network symmetry.
