
Anthropic Head of Pretraining on Scaling Laws, Compute, and the Future of AI
Y Combinator Startup Podcast
Improving GPU Utilization and Distributed Training
Ankit asks how the team squeezed more performance out of its hardware; Nick covers parallelism strategies, profiling, model FLOPs utilization (MFU), and modeling bottlenecks.
Transcript