Hattie Zhou: Lottery Tickets and Algorithmic Reasoning in LLMs

The Gradient: Perspectives on AI

The Relationship Between Training Loss and Grokking the Information Bottleneck Paper

Reading this paper and thinking about the training dynamics of a neural network over time, one connection that occurred to me was the overlap between your work and other recent work that has studied neural net training phenomena. So I'm curious whether, in looking at that other work, you've thought about the connections between the forget-and-relearn phenomenon and double descent.
