Episode 40: What Every LLM Developer Needs to Know About GPUs

Vanishing Gradients

Quick Insights on Training GPT-2 with Modern Hardware

This chapter covers training GPT-2 from scratch in just five minutes on 8x H100 GPUs using the Modal platform. The discussion highlights how fast and inexpensive the run is compared to traditional training methods.
