The Limits of the Gradients
The problem was first proposed in the original LSTM paper, actually. It's a synthetic sequential classification task — well, it's actually regression. The idea here is that you have sequences of arbitrary length, and they're two-dimensional, and the task is just adding them together. So we can use sequences of a hundred, we can use a thousand, and we can make it harder and harder. But for coRNN we actually found that even in the case of five thousand, you get more or less direct convergence, which is really nice. So I'm very curious what kind of results you saw, and how they compare from a performance perspective.
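For reference, the adding problem the speaker describes can be sketched as a data generator. This is a minimal sketch assuming the standard formulation (two channels per step: a random value and a marker that is set at exactly two positions; the regression target is the sum of the two marked values). The function name and exact marker placement are illustrative assumptions, not from the episode:

```python
import numpy as np

def adding_problem_batch(batch_size: int, seq_len: int, rng=None):
    """Generate one batch of the adding problem (illustrative sketch).

    Each sequence is two-dimensional: channel 0 holds random values in
    [0, 1]; channel 1 is a marker that is 1 at exactly two positions.
    The regression target is the sum of the two marked values.
    """
    rng = rng or np.random.default_rng()
    values = rng.uniform(0.0, 1.0, size=(batch_size, seq_len))
    markers = np.zeros((batch_size, seq_len))
    for i in range(batch_size):
        # Place one marker in each half so the marked positions are far
        # apart, which is what makes long sequences hard for RNNs.
        a = rng.integers(0, seq_len // 2)
        b = rng.integers(seq_len // 2, seq_len)
        markers[i, a] = 1.0
        markers[i, b] = 1.0
    x = np.stack([values, markers], axis=-1)  # shape (batch, seq_len, 2)
    y = (values * markers).sum(axis=1)        # shape (batch,)
    return x, y

# Sequence length is a free knob: 100, 1000, or 5000 as discussed above.
x, y = adding_problem_batch(4, 100)
```

Making the task "harder and harder" is then just a matter of increasing `seq_len`, since the model must carry the two marked values across a longer and longer gap.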