How Does ChatGPT Work? - ML 107

Adventures in Machine Learning

The Longest Training Time for a Deep Learning Model

Microsoft built a supercomputer for OpenAI with 285,000 CPU cores, 10,000 GPUs, and 400 gigabits per second of network connectivity for each GPU server. Even with all that hardware, the model still took months to train. Ben: What's the longest training time you've ever seen for a model you've been working on?
