Hugo Larochelle: Deep Learning as Science

The Gradient: Perspectives on AI

The Importance of Pre-Training Datasets

There are a lot of interesting connections and lessons here, I think. In Head2Toe, the way you articulated what you were doing reminded me a lot of the convex combination of pre-trained weights and the few-shot data generalization paper. In one of the other papers, on a universal representation transformer layer, you and the other authors used exactly the term "universal representation," which I find very evocative. It speaks to a larger question, which is: what is the right pre-training dataset for that model? We need to be more clever about how to influence the inductive principle of these kinds of models.
