
How Does ChatGPT Work? - ML 107
Adventures in Machine Learning
The Scale of ChatGPT's Training Data
GPT-3 was trained on 499 billion tokens. 410 billion of them came from Common Crawl, so essentially just web text. Another 19 billion came from a separate set of web text. And then there were two sets of books, one with 12 billion tokens and one with 55 billion tokens.

Ben: Can you start to describe the scale of this training data? Or is it even possible?
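A quick tally makes those numbers concrete. This sketch uses the per-dataset token counts reported for GPT-3 (Brown et al., 2020, Table 2.2); the dataset names (Common Crawl, WebText2, Books1, Books2, Wikipedia) are the paper's, not the episode's, and Wikipedia's ~3 billion tokens, which the hosts don't mention, is what closes the gap to the 499 billion total.

```python
# Token counts for the GPT-3 training mix, as reported in
# Brown et al. (2020). Names follow the paper; the transcript
# only describes them informally.
datasets = {
    "Common Crawl (filtered)": 410e9,  # "essentially just web text"
    "WebText2": 19e9,                  # the "separate set of web text"
    "Books1": 12e9,                    # first book corpus
    "Books2": 55e9,                    # second book corpus
    "Wikipedia": 3e9,                  # not mentioned in the episode
}

total = sum(datasets.values())
print(f"Total: {total / 1e9:.0f}B tokens")  # -> 499B
for name, tokens in datasets.items():
    print(f"{name:>24}: {tokens / 1e9:>4.0f}B  ({tokens / total:5.1%})")
```

Running it shows Common Crawl alone is roughly 82% of the corpus, which is why the hosts shorthand the training data as "just web text."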