
Hattie Zhou: Lottery Tickets and Algorithmic Reasoning in LLMs

The Gradient: Perspectives on AI

CHAPTER

How Do You Set Weights to Zero?

I'm curious, when it comes to setting weights to zero, what the picture might look like for really large networks, or at least networks larger than the ones you were looking at. One thing I can imagine, in line with the idea that a model forgets things over the course of training but then relearns them (not so subtly referencing another paper we're going to talk about soon), is that at some point in training a set of weights starts converging to zero, but then later in the training dynamics the model realizes: oh wait, these were actually important for something or other.
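
As a purely illustrative aside (not something from the episode or the paper under discussion), here is a minimal PyTorch sketch of how one might probe that hypothesis: snapshot a layer's weights at regular checkpoints, then flag weights whose magnitude dips near zero at some point but grows back later. The thresholds, the toy network and loss, and the helper names (magnitude_trace, dipped_then_recovered) are all assumptions chosen for illustration.

    import torch
    import torch.nn as nn

    def magnitude_trace(snapshots):
        # Stack |w| across checkpoints: shape (num_checkpoints, num_weights).
        return torch.stack([s.abs().flatten() for s in snapshots])

    def dipped_then_recovered(trace, dead_thresh=1e-3, alive_thresh=1e-2):
        # Mask of weights that fell below dead_thresh at some checkpoint
        # and rose above alive_thresh at a strictly later checkpoint.
        # Both thresholds are arbitrary illustrative choices.
        num_ckpts, num_weights = trace.shape
        ever_dead = torch.zeros(num_weights, dtype=torch.bool)
        recovered = torch.zeros(num_weights, dtype=torch.bool)
        for t in range(num_ckpts):
            recovered |= ever_dead & (trace[t] > alive_thresh)
            ever_dead |= trace[t] < dead_thresh
        return recovered

    # Toy usage: snapshot one linear layer during a throwaway training loop.
    layer = nn.Linear(32, 32)
    opt = torch.optim.SGD(layer.parameters(), lr=0.1)
    snapshots = []
    for step in range(200):
        x = torch.randn(8, 32)
        loss = layer(x).pow(2).mean()  # contrived loss that shrinks weights
        opt.zero_grad()
        loss.backward()
        opt.step()
        if step % 10 == 0:
            snapshots.append(layer.weight.detach().clone())

    mask = dipped_then_recovered(magnitude_trace(snapshots))
    print(f"{int(mask.sum())} weights dipped near zero and later recovered")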
