
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
Is It a Dropout Intuition?
It's not clear why it works. I think the original paper thought it might have something to do with learning independent sub-networks because you're resetting each time, and somehow that should be good for generalization. But I couldn't really understand why that should help. In order to induce compositionality, or in order to separate these things, it might be nice to have it be more dependent or adaptive, or something besides just this fixed thing. It's cool that it works, though. You have an adult.