
How Does ChatGPT Work? - ML 107
Adventures in Machine Learning
The Longest Training Time for a Deep Learning Model
Microsoft built a supercomputer for OpenAI with 285,000 CPU cores, 10,000 GPUs, and 400 gigabits per second of network connectivity per GPU server. Even with all that hardware, the model still took months to train. Ben: What's the longest training time you've ever seen for a model you've been working on?