
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
How to Train a Sparse Network?
In your paper, why is the magnitude pruning algorithm the right way to identify these lottery tickets? It seems very simple. So why are these weights that end up large also good if you train them in isolation from their initializations? That was the main question, I think, that I wanted to answer. And the other questions are: what is it about the subnetwork that is important? Why is this particular subnetwork good for the training process? And could other ways of identifying subnetworks also work? But as we were exploring and finding new things, new questions came up.
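For context on the pruning step discussed here, the following is a minimal sketch (my own illustration under assumed conventions, not code from the paper) of how magnitude pruning produces a sparse mask: after training, the weights with the largest magnitudes are kept, and in the lottery-ticket procedure that mask is applied back to the original initialization before the subnetwork is retrained. The function and variable names are hypothetical.

```python
import torch

def magnitude_prune_mask(weights: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a binary mask keeping the largest-magnitude weights.

    `sparsity` is the fraction of weights to remove (e.g. 0.9 keeps 10%).
    """
    flat = weights.abs().flatten()
    k = int(sparsity * flat.numel())        # number of weights to prune
    if k == 0:
        return torch.ones_like(weights)
    threshold = flat.kthvalue(k).values     # k-th smallest magnitude
    return (weights.abs() > threshold).float()

# Example: prune 90% of a layer's weights by magnitude
trained_layer = torch.randn(256, 256)       # stand-in for trained weights
mask = magnitude_prune_mask(trained_layer, sparsity=0.9)
initial_layer = torch.randn(256, 256)       # stand-in for the original initialization
lottery_ticket = initial_layer * mask       # subnetwork to retrain in isolation
print(f"kept {mask.mean().item():.1%} of weights")
```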