Hattie Zhou: Lottery Tickets and Algorithmic Reasoning in LLMs

The Gradient: Perspectives on AI

CHAPTER

The Relationship Between Training Loss and Grokking the Information Bottleneck Paper

In reading this paper and thinking about the training dynamics of a neural network over time, one connection that occurred to me was the overlap between your work and other recent work that has studied neural net training phenomena. So I'm curious whether, in looking at that other work, you've thought about the connections between the forget-and-relearn phenomenon and double descent.
