Is It Possible to Have an Upstream Corpus That Has No Linguistic Structure?

There's something about the pre-training process that gives you an initialization such that fine-tuning from there is smooth sailing. We're not using unsupervised learning to somehow incorporate additional data which we don't have for downstream fine-tuning, and we just pre-trained on that. There's no semi-supervised aspect here, in the sense that we're not taking advantage of unlabeled data. And so the title is something like "upstream data sets make surprisingly good pre-training models."
