
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
00:00
Is It a Dropout Intuition?
It's not clear why it works. I think the original paper thought it might have something to do with learning independent sub-networks, because you're resetting each time, and somehow that should be good for generalization. But I couldn't really understand why that should help. In order to induce compositionality, or to separate these things, it might be nice to have it be more dependent or adaptive, or something besides just this fixed thing. It's cool that it works, though. You have an adult.