
101 - The lottery ticket hypothesis, with Jonathan Frankle
NLP Highlights
Is There a Difference Between Pruning Convolutions and Pruning Fully Connected Layers?
The accuracy of the smaller network is higher than that of the original network. One hypothesis might be that, simply, the network has less capacity to overfit. That could be one hypothesis, but it's not something we've investigated rigorously, so I wouldn't put that on paper and submit that anywhere right now.