Can You Do Better Than Just Combining the Two Kinds of Data?
The captioned images have a long-tailed distribution over objects, so they may not cover all the categories that you want to cover. So if we wanted to do that for pre-training, that's what we look at in this paper. It turns out that you can do better than just combining the two kinds of data. And then when you train on both the fake captions and the real captions, you do get a stronger model. You get a more powerful model that combines the knowledge from these two kinds of data. A data augmentation type of approach? Yes, exactly.