
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
How to Train a Sparse Network?
In your paper, why is the magnitude pruning algorithm the right way to identify these lottery tickets? It seems very simple. So why are these weights that end up large also good if you train them in isolation from their initializations? That was the main question I think I wanted to answer. And the other questions are: what about the subnetwork is important? Why is this particular subnetwork good for the training process? And could other ways of identifying subnetworks also work? But as we were exploring and finding new things, new questions came up.
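
For context, here is a rough sketch of the magnitude-pruning step being discussed: train the dense network, keep the largest-magnitude weights, and reset the kept weights to their original initialization before retraining. This is an illustrative NumPy sketch, not code from the paper; the function names and the dict-of-arrays representation are assumptions.

```python
import numpy as np

def magnitude_prune_mask(trained_weights, keep_fraction=0.2):
    """Global magnitude pruning: keep the largest |w| across all layers.

    trained_weights: dict mapping layer name -> NumPy weight array.
    Returns a dict of binary masks with the same shapes.
    """
    # Pool all magnitudes to pick a single global threshold.
    all_magnitudes = np.concatenate(
        [np.abs(w).ravel() for w in trained_weights.values()]
    )
    threshold = np.quantile(all_magnitudes, 1.0 - keep_fraction)
    return {name: (np.abs(w) >= threshold).astype(w.dtype)
            for name, w in trained_weights.items()}

def lottery_ticket_weights(initial_weights, trained_weights, keep_fraction=0.2):
    """Apply the mask to the *initial* weights, so the sparse subnetwork
    can be retrained in isolation from its original initialization."""
    masks = magnitude_prune_mask(trained_weights, keep_fraction)
    return {name: initial_weights[name] * masks[name] for name in masks}
```

The key point this illustrates is the question raised above: the mask is chosen from the weights that end up large after training, yet the subnetwork is retrained from the initial values, and it is not obvious why those two things should line up so well.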