Does Gradient Descent Make More Use of the Latent Space?
Gradient descent is going to make some use of those other 251 dimensions, but it's going to have a very minuscule or diminishing effect on the latent space. If I then took that same network and increased the complexity of the problem, we could end up with it sparsifying for any particular class. It might be forced to make more compact use of that latent information space per class. Is that fair?

Yes, that's a very good point. And one thing to keep in mind: so let's say your data is even a linear manifold of dimension one, and then you go through a deep network, and then somehow it's already linearly separable. But if you start
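Not part of the transcript, but as a rough illustration of the "how many of those other dimensions are really used" question: the sketch below estimates the effective dimensionality of a batch of latent activations from the singular values of the centered activation matrix. The `effective_rank` helper and the toy 256-dimensional latents are hypothetical, chosen only to show how one might measure compact versus diffuse use of a latent space.

```python
import numpy as np

def effective_rank(latents, var_threshold=0.99):
    """Estimate how many latent directions carry most of the variance.

    latents: (n_samples, latent_dim) array of, e.g., penultimate-layer
    activations. Returns the number of principal directions needed to
    explain `var_threshold` of the total variance.
    """
    centered = latents - latents.mean(axis=0, keepdims=True)
    # Singular values of the centered activations give the spread along
    # each principal direction of the latent representation.
    s = np.linalg.svd(centered, compute_uv=False)
    var = s ** 2
    cumulative = np.cumsum(var) / var.sum()
    return int(np.searchsorted(cumulative, var_threshold) + 1)

# Toy example: 256-dimensional latents that really live on a ~5-dimensional
# subspace, plus a little isotropic noise.
rng = np.random.default_rng(0)
basis = rng.normal(size=(5, 256))
codes = rng.normal(size=(1000, 5))
latents = codes @ basis + 0.05 * rng.normal(size=(1000, 256))

# Prints a number close to 5: only a handful of directions dominate,
# even though the nominal latent dimensionality is 256.
print(effective_rank(latents))
```

Run per class, a measure like this would show whether increasing task complexity forces the network to use its latent space more compactly for each class, in the sense discussed above.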