The Relationship Between Pre-Trained Models and Fine Tuning Models
The idea that some kind of general model could be better than fine-tuning on a particular data distribution doesn't really make sense to me, but I guess it does. So if you fine-tune the model and then take a weight-space ensemble along the path from the original starting point to the fine-tuned model, that turns out to be a very successful way to trade off the out-of-distribution generalization and flexibility of these large pre-trained models against in-distribution performance. Then if we further fine-tune on specific data sets, of course, fine-tuning is always better, but that's not always an option.
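The weight-space ensemble described here amounts to linearly interpolating every parameter between the pre-trained and fine-tuned checkpoints. A minimal sketch of that idea, assuming checkpoints stored as simple name-to-array dictionaries (the function name `weight_space_ensemble` and the toy parameters are illustrative, not from the episode):

```python
import numpy as np

def weight_space_ensemble(pretrained, finetuned, alpha):
    """Interpolate each parameter between two checkpoints.

    alpha = 0 recovers the pre-trained model, alpha = 1 the
    fine-tuned model; intermediate values trace the path between them.
    """
    return {
        name: (1 - alpha) * pretrained[name] + alpha * finetuned[name]
        for name in pretrained
    }

# Toy checkpoints with two named parameters.
pre = {"w": np.array([0.0, 0.0]), "b": np.array([1.0])}
fin = {"w": np.array([1.0, 2.0]), "b": np.array([3.0])}

mid = weight_space_ensemble(pre, fin, 0.5)
```

Sweeping `alpha` between 0 and 1 is how one would trade off the pre-trained model's out-of-distribution robustness against the fine-tuned model's in-distribution accuracy.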