The Relationship Between Training Loss and Grokking the Information Bottleneck Paper

While reading this paper and thinking about how a neural network's training dynamics evolve over time, one connection that occurred to me was the overlap between your work and other recent work studying neural network training phenomena. So I'm curious whether, in looking at that other work, you've thought about connections to the forget-relearn phenomenon or double descent.

