
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
The Later Layers Are Learning on Top of the Earlier Features
When you're removing half of the weights, there's some work that does this on the later layers. The later layers are sort of learning on top of the earlier features, so you can get rid of some of those features. Then, the next time you build them again, you get to take advantage of the better representation that you now have later in the training process, when you're rebuilding them.