
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
00:00
Is There a Sparse Architecture in Computer Vision?
Large language models seem to generalize quite well, at least in terms of representation learning. So I think you're saying we might not get a better ResNet out of this by training it on ImageNet, for example. But what about large language models and vision transformers? Yeah. It could just be based on the complexity of the data set versus the capacity of the model.