The Future of Data and Model Scaling

It is surprising how well these language models absorb information from the pre-training data. I'd say that any given fact will appear in many different documents on the Internet, and if it's only in one document, the model probably won't be able to recall it. But it's an interesting question how many times the model has to see the fact to really internalize it.
