
67 - GLUE: A Multi-Task Benchmark and Analysis Platform, with Sam Bowman
NLP Highlights
How to Train a Multi-Task Model
Multi-task models are what get us our best performance. We found that adding ELMo to anything gives you improvements in performance consistently across all of these tasks. Pre-training on some subset of these tasks will generally get you decent performance on the rest of them. And, probably unsurprisingly, using attention and using large models really helps. It's a really interesting set of experiments that you ran; it sounds like I need to look at them more closely.
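For listeners who want a concrete picture of what "training a multi-task model" looks like in practice, here is a minimal sketch, not the GLUE baseline code from the episode: a shared encoder feeds task-specific classification heads, and each training step samples a batch from one task. The model names, dimensions, and toy data are illustrative assumptions.

```python
# Minimal multi-task training sketch (illustrative, not the GLUE baselines).
# A shared encoder feeds per-task heads; each step samples one task's batch.
import random
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, task_num_labels):
        super().__init__()
        # Shared encoder; a contextual embedder such as ELMo would slot in
        # here in place of this toy MLP over pre-computed sentence vectors.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
        )
        # One lightweight classification head per task.
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, n_labels)
            for task, n_labels in task_num_labels.items()
        })

    def forward(self, task, x):
        return self.heads[task](self.encoder(x))

# Toy setup: two hypothetical tasks with different label spaces.
tasks = {"sst2": 2, "mnli": 3}
model = MultiTaskModel(input_dim=300, hidden_dim=128, task_num_labels=tasks)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    # Sample a task, then a (random, stand-in) batch for that task.
    task = random.choice(list(tasks))
    x = torch.randn(32, 300)
    y = torch.randint(0, tasks[task], (32,))
    loss = loss_fn(model(task, x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Pre-training on a subset of tasks, as discussed above, would correspond to running this loop over only some of the heads and then fine-tuning or evaluating on the held-out tasks.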