
101 - The lottery ticket hypothesis, with Jonathan Frankle
NLP Highlights
How Does Pruning Work?
In the original paper we focused on small vision networks, so small, fully connected networks and convolutional networks for MNIST and CIFAR-10. In follow-up work, we've extended this to large-scale convolutional residual networks for ImageNet. The conclusion is the same. You don't need to do the pruning per layer, that's correct. We found in practice that the network, when you prune in this way, seems to find good proportions in which to prune each layer. And we can actually get better sparsity numbers than state of the art, just by pruning globally, rather than trying to be too careful about which layers we prune.
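To make the "pruning globally" idea concrete, here is a minimal sketch of global magnitude pruning in NumPy. It is not the code from the paper: the function name `global_magnitude_prune`, the toy layer names and shapes, and the 50% sparsity target are all assumptions chosen for illustration. The point it shows is that a single magnitude threshold over the whole network lets each layer end up with its own pruning proportion.

```python
import numpy as np

def global_magnitude_prune(weights, sparsity):
    """Hypothetical helper: return one boolean keep-mask per layer.

    The smallest-magnitude weights are pruned across all layers together,
    rather than pruning a fixed fraction inside each layer.
    """
    all_magnitudes = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    n_prune = int(sparsity * all_magnitudes.size)
    if n_prune == 0:
        return {name: np.ones(w.shape, dtype=bool) for name, w in weights.items()}
    # Single network-wide threshold: the n_prune-th smallest magnitude.
    # Weights tied exactly at the threshold are pruned as well.
    threshold = np.partition(all_magnitudes, n_prune - 1)[n_prune - 1]
    return {name: np.abs(w) > threshold for name, w in weights.items()}

# Toy "network" (assumed shapes and scales, not from the episode).
rng = np.random.default_rng(0)
layers = {
    "conv1": rng.normal(0.0, 1.0, size=(8, 8)),    # few, large weights
    "fc": rng.normal(0.0, 0.1, size=(64, 32)),     # many, small weights
}

masks = global_magnitude_prune(layers, sparsity=0.5)

# Fraction of weights removed in each layer under the shared threshold.
layer_sparsity = {name: 1.0 - mask.mean() for name, mask in masks.items()}
for name, frac in layer_sparsity.items():
    print(f"{name}: {frac:.1%} of weights pruned")
```

In this toy setup the layer with smaller weights loses a much larger share of its parameters, which is the sense in which a global criterion picks per-layer proportions on its own. A per-layer scheme would instead apply the same fraction to every layer.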