5. Charlie Snell on DALL-E and CLIP

The Inside View

Scaling Laws

GPT-3 requires fewer training samples to reach a lower loss, or the same loss. People might say GPT-3 is overfitting, but in a technical sense it really can't overfit, because it didn't even see all the data points more than once. So yeah, that was something that impressed me about scaling laws. I would say I'm generally pro-scaling. There's a difference between being pro-scaling and being optimistic about scaling.
