Machine Learning Street Talk (MLST)

Want to Understand Neural Networks? Think Elastic Origami! - Prof. Randall Balestriero

Feb 8, 2025
Professor Randall Balestriero, an expert in machine learning, dives into neural network geometry and spline theory. He introduces the concept of 'grokking', explaining how prolonged training can yield delayed generalization and even adversarial robustness. The discussion also covers how representing networks as splines can inform model design and performance, the geometric implications for large language models in toxicity detection, and the challenges of reconstruction-based representation learning.
INSIGHT

Neural Networks as Elastic Origami

  • Neural networks with piecewise-linear activations act like elastic origami, folding and warping the input space.
  • They partition the input space into linear regions and apply a distinct affine map on each region.
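The partition view can be made concrete in a few lines of NumPy. This is an illustrative sketch, not code from the episode: a small randomly initialized ReLU network, where each distinct pattern of active ReLUs identifies one linear region (one "fold" of the origami).

```python
import numpy as np

# A tiny ReLU network: 2-D input -> 8 hidden units -> 1 output.
# Weights are random here; the point is the geometry, not the task.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 2))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(1, 8))
b2 = rng.normal(size=1)

def activation_pattern(x):
    """Binary code of which ReLUs are active at x; each distinct
    code identifies one linear region of the input partition."""
    return tuple(W1 @ x + b1 > 0)

def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Count the distinct regions intersecting the square [-2, 2]^2.
xs = np.linspace(-2.0, 2.0, 200)
regions = {activation_pattern(np.array([x, y])) for x in xs for y in xs}
print(len(regions))  # 8 folds in 2-D give at most 1 + 8 + C(8,2) = 37 regions
```

Within any one region the network is exactly affine, which is why the spline view is more than a metaphor: the whole function is a continuous patchwork of affine maps.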
INSIGHT

Grokking and Adversarial Robustness

  • Grokking, a delayed-generalization phenomenon, appears in general settings, not only the small algorithmic-dataset setups where it was first observed.
  • Adversarial robustness can emerge late in training, even without explicit adversarial training.
INSIGHT

Grokking as Local Decomplexification

  • Grokking involves a shift from complex to simpler local representations during training.
  • The network transitions from memorizing training points to focusing on decision boundaries.
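A hedged sketch of what measuring "local complexity" might look like. The function name and sampling scheme below are illustrative assumptions, not the paper's exact metric: count how many linear regions a small neighborhood of a point intersects; fewer regions means the network is locally simpler (more nearly affine) around that point.

```python
import numpy as np

# Random ReLU layer standing in for a trained network's first layer.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 2))
b1 = rng.normal(size=16)

def local_complexity(x, radius=0.5, n_samples=500):
    """Proxy for local complexity: the number of distinct ReLU
    activation patterns (linear regions) hit by Gaussian
    perturbations of scale `radius` around x."""
    pts = x + radius * rng.normal(size=(n_samples, 2))
    return len({tuple(W1 @ p + b1 > 0) for p in pts})

# With unit-scale biases the fold hyperplanes w.x + b = 0 cluster near
# the origin, so a distant point sits deep inside a single region.
lc_origin = local_complexity(np.array([0.0, 0.0]))
lc_far = local_complexity(np.array([100.0, 100.0]))
print(lc_origin, lc_far)
```

Under the decomplexification view, training would drive a curve like `lc_origin` down around training points while folds concentrate near the decision boundary.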