
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
Compositionality and Transformers at Scale, Part 2
I think what's interesting for me in the compositionality-and-transformers-at-scale part is that we are seeing some signs of compositionality as we scale them up. I haven't fully changed my opinion, but it's shifting. My hunch is that if you're given a dataset with enough diversity, such that a compositional representation becomes the most efficient way to represent the entire dataset, then compositionality emerges automatically. But that's usually not the case with a limited dataset, because there are probably entangled concepts to exploit there, right? That's interesting. It's similar to the iterated learning idea for compositional language, where the information bottleneck...