
LessWrong (Curated & Popular) “Gradient Routing: Masking Gradients to Localize Computation in Neural Networks” by cloud, Jacob G-W, Evzen, Joseph Miller, TurnTrout
Dec 9, 2024
Dive into gradient routing, a technique that controls where learning happens in neural networks by applying masks to gradients during backpropagation. Discover how it can lead to safer AI systems by enabling transparency and oversight. Learn how it was used to split an MNIST autoencoder's latent space so that different digit classes are encoded in different halves, and to localize computation in language models. The discussion also touches on robust unlearning and the importance of scalable oversight, showcasing the potential of specialized AI.
Gradient Routing
- Gradient routing controls where learning happens in neural networks by masking gradients during backpropagation.
- Different masks for different data points create specialized subcomponents within a model (a minimal sketch of the mechanism follows this list).
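A minimal sketch of the core trick in PyTorch, not the authors' exact code: the forward pass is left unchanged, while detach() blocks the backward pass wherever the mask is zero. The helper name route_gradients is hypothetical.

```python
import torch

def route_gradients(activations: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Identity in the forward pass; in the backward pass, gradients flow
    only through entries where mask == 1, because detach() blocks
    backpropagation through the remaining entries."""
    return mask * activations + (1 - mask) * activations.detach()
```

Supplying a different mask for each data point (based on its label, source, or content) is what carves out specialized subcomponents during training.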
MNIST Latent Space Splitting
- An MNIST autoencoder was trained with gradient routing to split its latent space (see the sketch after this list).
- Digits 0-4 were routed through one half of the latent space, and digits 5-9 through the other, demonstrating specialization.
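A sketch of how that routing could look in PyTorch, assuming a 32-dimensional latent for illustration; the encoder and decoder modules and the training loop are elided, and the mask construction is an assumption about the setup, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

LATENT_DIM = 32  # assumed latent size, for illustration

def latent_mask(labels: torch.Tensor) -> torch.Tensor:
    """Per-example mask over the latent: digits 0-4 may only update the
    first half of the dimensions, digits 5-9 only the second half."""
    mask = torch.zeros(labels.shape[0], LATENT_DIM, device=labels.device)
    bottom = labels <= 4
    mask[bottom, : LATENT_DIM // 2] = 1.0
    mask[~bottom, LATENT_DIM // 2 :] = 1.0
    return mask

def training_step(encoder, decoder, images, labels):
    z = encoder(images)                      # (batch, LATENT_DIM)
    m = latent_mask(labels)
    z = m * z + (1 - m) * z.detach()         # gradient routing: stop-gradient off-mask
    recon = decoder(z)
    return F.mse_loss(recon, images)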
Steering Scalar
- Routing the token "California" to a specific residual-stream dimension in a language model localized the related features there (sketched below).
- This showed that gradient routing can direct where specific features are learned in the model.
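A sketch of that localization using the same stop-gradient trick: at positions holding the target token, only one residual-stream dimension receives gradient, so the feature is pushed into a known location. The dimension index, helper name, and hook placement are all illustrative assumptions, not the authors' exact setup.

```python
import torch

STEER_DIM = 0  # assumed residual-stream dimension chosen to hold the feature

def steering_mask(token_ids: torch.Tensor, target_id: int, d_model: int) -> torch.Tensor:
    """Ones everywhere (normal learning), except at positions whose token is
    target_id, where only STEER_DIM may receive gradient."""
    mask = torch.ones(*token_ids.shape, d_model, device=token_ids.device)
    at_target = token_ids == target_id
    mask[at_target] = 0.0
    mask[at_target, STEER_DIM] = 1.0
    return mask

# Applied to the residual stream h of shape (batch, seq, d_model) at some
# layer, with california_id the token id for "California":
# m = steering_mask(token_ids, california_id, h.shape[-1])
# h = m * h + (1 - m) * h.detach()
```

Because the learned feature then lives in a known dimension, it can be inspected directly or scaled up and down to steer the model's behavior, which is the "steering scalar" in the snip title.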
