
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
The Story of Coherent Gradients
It's not clear that high gradient coherence really should lead to better generalization, either. I think the story of coherent gradients is very simple. It makes sense, but it's probably not the full story. But we don't know if that's harmful either. Could there be a way that that's beneficial? It's possible.