The Challenges of Scaling Up Bigger Models

There are rumors now that Microsoft is in talks to invest $10 billion in OpenAI, and presumably a big chunk of that will be applied to compute as they try to scale up the size of the models even further. When you go from GPT-2 to GPT-3, what are the challenges that you face as an organization that's trying to train these really large models? I assume it's not as simple as just saying, "Hey, here's more data, train a bigger model." What are the complexities that companies face when they actually get to do that?

We've literally hit the limits of everything that's on the Internet. It's kind of an open question
