Is There a Way to Distill Knowledge From One Location to Another?
Convolutional nets do that, and transformers do that too. I don't think the brain can do that, because that would involve weight sharing: it would involve doing exactly the same computation at each locality so you can use the same weights. But actually, there's a way to achieve what weight sharing achieves in a convolutional net in a much more plausible way than I think people have suggested before. If you do have contextual predictions trying to agree with locally extracted things, then imagine a whole bunch of columns that are making local predictions and looking at nearby columns to get their contextual prediction. You can think of the context as a teacher for the local thing, but also vice versa.
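The idea of columns teaching each other can be illustrated with a small toy simulation. This is only a sketch under strong assumptions, not the speaker's actual model: the name `simulate_columns`, the linear columns, the ring of neighbours, the noise level, and the choice to hold column 0 fixed (so the columns cannot all agree on a trivial zero output) are hypothetical and chosen for illustration. Each column keeps its own private weights, makes a local prediction from its own noisy input, and is nudged toward the average prediction of its two neighbours, which plays the role of the contextual prediction. No weights are ever copied, yet the columns end up computing roughly the same function.

```python
import numpy as np


def simulate_columns(num_columns=6, dim=4, steps=3000, lr=0.05, seed=0):
    """Toy model: columns learn to agree with their neighbours' predictions.

    Returns the mean absolute difference between every column's weights and
    column 0's weights, measured before and after training.
    """
    rng = np.random.default_rng(seed)

    # Every column has its OWN weight vector; nothing is shared or copied.
    weights = rng.normal(size=(num_columns, dim))
    before = np.abs(weights[1:] - weights[0]).mean()

    for _ in range(steps):
        # Toy assumption: each column sees a slightly different noisy view
        # of the same underlying content.
        signal = rng.normal(size=dim)
        inputs = signal + 0.05 * rng.normal(size=(num_columns, dim))

        # Local prediction of each column: dot product of its own weights
        # with its own input.
        local = np.einsum("kd,kd->k", weights, inputs)

        # Contextual prediction for each column: the average of the local
        # predictions of its two neighbours on a ring.
        context = 0.5 * (np.roll(local, 1) + np.roll(local, -1))

        # Each column treats its contextual prediction as a fixed target
        # (the "teacher") and takes a gradient step on the squared error.
        # Because every column is also part of its neighbours' context,
        # each one acts as a teacher as well as a student.
        error = local - context
        grad = error[:, None] * inputs

        # Toy assumption: column 0 is held fixed as an anchor so that the
        # columns cannot reach agreement by all shrinking to zero.
        grad[0] = 0.0
        weights -= lr * grad

    after = np.abs(weights[1:] - weights[0]).mean()
    return before, after


if __name__ == "__main__":
    before, after = simulate_columns()
    print(f"mean |w_k - w_0| before training: {before:.3f}")
    print(f"mean |w_k - w_0| after training:  {after:.3f}")
```

In this toy setup the knowledge held in column 0 spreads to the other locations purely through agreement between local and contextual predictions, which is the effect weight sharing would otherwise provide. The agreement is only approximate, since the input noise keeps the weights jittering slightly.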