Machine Learning Street Talk (MLST)

Want to Understand Neural Networks? Think Elastic Origami! - Prof. Randall Balestriero

Feb 8, 2025
Professor Randall Balestriero, an expert in machine learning, dives into neural network geometry and spline theory. He introduces the concept of 'grokking', explaining how prolonged training can yield delayed generalization and even adversarial robustness. The discussion also covers how representing networks as splines can inform model design and performance, the geometric implications for large language models in toxicity detection, and the challenges of reconstruction-based representation learning.
INSIGHT

Neural Networks as Elastic Origami

  • Neural networks with piecewise-linear activations act like elastic origami, folding and warping the input space.
  • They partition the input space into linear regions and apply a distinct affine map on each region.
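The partition view can be made concrete in a few lines of NumPy. This is an illustrative sketch, not code from the episode: a small randomly initialized ReLU network, where each distinct pattern of active ReLUs identifies one linear region (one "fold" of the origami).

```python
import numpy as np

# A tiny ReLU network: 2-D input -> 8 hidden units -> 1 output.
# Weights are random here; the point is the geometry, not the task.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 2))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(1, 8))
b2 = rng.normal(size=1)

def activation_pattern(x):
    """Binary code of which ReLUs are active at x; each distinct
    code identifies one linear region of the input partition."""
    return tuple(W1 @ x + b1 > 0)

def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Count the distinct regions intersecting the square [-2, 2]^2.
xs = np.linspace(-2.0, 2.0, 200)
regions = {activation_pattern(np.array([x, y])) for x in xs for y in xs}
print(len(regions))  # 8 folds in 2-D give at most 1 + 8 + C(8,2) = 37 regions
```

Within any one region the network is exactly affine, which is why the spline view is more than a metaphor: the whole function is a continuous patchwork of affine maps.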
INSIGHT

Grokking and Adversarial Robustness

  • Grokking, a delayed-generalization phenomenon, appears in general settings, not only the small algorithmic-dataset setups where it was first observed.
  • Adversarial robustness can emerge late in training, even without explicit adversarial training.
INSIGHT

Grokking as Local Decomplexification

  • Grokking involves a shift from complex to simpler local representations during training.
  • The network transitions from memorizing training points to focusing on decision boundaries.
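A hedged sketch of what measuring "local complexity" might look like. The function name and sampling scheme below are illustrative assumptions, not the paper's exact metric: count how many linear regions a small neighborhood of a point intersects; fewer regions means the network is locally simpler (more nearly affine) around that point.

```python
import numpy as np

# Random ReLU layer standing in for a trained network's first layer.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 2))
b1 = rng.normal(size=16)

def local_complexity(x, radius=0.5, n_samples=500):
    """Proxy for local complexity: the number of distinct ReLU
    activation patterns (linear regions) hit by Gaussian
    perturbations of scale `radius` around x."""
    pts = x + radius * rng.normal(size=(n_samples, 2))
    return len({tuple(W1 @ p + b1 > 0) for p in pts})

# With unit-scale biases the fold hyperplanes w.x + b = 0 cluster near
# the origin, so a distant point sits deep inside a single region.
lc_origin = local_complexity(np.array([0.0, 0.0]))
lc_far = local_complexity(np.array([100.0, 100.0]))
print(lc_origin, lc_far)
```

Under the decomplexification view, training would drive a curve like `lc_origin` down around training points while folds concentrate near the decision boundary.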