How to Train a Base Model for Semantic Segmentation
I think we're taking less than 10% of the FLOPs in terms of training time, because of the sparsity of the classifier and, again, also because the probe is sparse. I don't see a reason why it wouldn't be applicable to regression. It would be interesting to see if it could be applicable to things like semantic segmentation or other things like that, so this points to potentially interesting follow-up work, for sure. We often found that the first hidden layer, the one that's closest to the input, was fairly frequently useful, and was particularly useful when the downstream task was very different from the pre-training tasks. The top embedding was pretty good already for solving this pets downstream task, but if