How to Scale Up a Language Model

With today's models, I guess this really depends on where you are and whether you actually have any real use for them. But a lot of the advances we've seen come just from that scaling hypothesis: take the big language model and make it even bigger. The work on making transformers smaller is more interesting. And then what you're speaking about, too, these packaging questions, I feel like even a lot of people in ML who may be familiar with data and model parallelism haven't thought about those quite as much.
