120. Liam Fedus and Barrett Zoph - AI scaling with mixture of expert models

Towards Data Science

Is There a Trade-Off Between Instability and Pretraining Quality?

When we first encountered these instability issues, we tried a really wide variety of techniques to try to fix the instabilities. We did notice this very clear trade-off: techniques that would fix the instability would also hurt the quality. And yes, but why does this exist? I think that is a pretty good open question, almost.
