
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
Is There a Better Way to Train Supermasks?
I would love to see better ways of training supermasks, or just training masks on top of weights in general, because I think there are a lot of unexplored use cases for these masks. You could imagine that this kind of training process might be less prone to overfitting, for example, since your underlying weights are still coming from a random initialization. Another use case I've seen in some papers is using a mask as a way to probe a trained model: instead of looking for a mask on top of randomly initialized weights, you take a trained model and try to find a mask that corresponds to a certain objective. That's really interesting.
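For concreteness, here is a minimal sketch (not from the episode) of what "training a mask on top of frozen random weights" can look like in PyTorch, in the spirit of the supermask line of work: the weights are frozen at initialization, a score is learned per weight, the top-scoring fraction of weights is kept, and a straight-through estimator lets gradients flow to the scores. The class name, sparsity level, and score initialization are illustrative assumptions, not the method discussed in the episode.

```python
import torch
import torch.nn as nn


class SupermaskLinear(nn.Module):
    """Linear layer with frozen random weights and a learned binary mask.

    A sketch of the supermask idea: weights stay at their random
    initialization; only per-weight scores are trained, and the top-k
    scores decide which weights are "on". Assumes 0 < sparsity < 1.
    """

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__()
        # Frozen random weights: never updated by the optimizer.
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features) * in_features ** -0.5,
            requires_grad=False,
        )
        # Trainable scores from which the mask is derived.
        self.scores = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.sparsity = sparsity

    def forward(self, x):
        # Keep the top (1 - sparsity) fraction of weights by score.
        n = self.scores.numel()
        k = int((1 - self.sparsity) * n)
        threshold = self.scores.flatten().kthvalue(n - k + 1).values
        hard_mask = (self.scores >= threshold).float()
        # Straight-through estimator: the forward pass uses the hard
        # 0/1 mask, while the backward pass routes gradients to the
        # scores as if the masking were the identity.
        mask = hard_mask + self.scores - self.scores.detach()
        return nn.functional.linear(x, self.weight * mask)


layer = SupermaskLinear(784, 256)
# Only the scores are optimized; the weights never change.
optimizer = torch.optim.SGD([layer.scores], lr=0.1)
```

The same score-and-mask construction applies unchanged to the probing use case mentioned above: load pretrained weights into `self.weight` instead of random ones, keep them frozen, and train only the mask against the objective of interest.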