ICLR 2024 — Best Papers & Talks (ImageGen, Vision, Transformers, State Space Models) ft. Durk Kingma, Christian Szegedy, Ilya Sutskever

Latent Space: The AI Engineer Podcast

CHAPTER

Optimizing Communication in GPU Clusters

This chapter explores the communication bottlenecks that arise when training models across large GPU clusters and introduces ZeRO++, an optimization technique from the DeepSpeed team. It covers innovations such as block-based quantization and a novel all-to-all collective design that improve training efficiency while managing quantization overheads. The discussion traces the evolution of DeepSpeed's ZeRO family and its strategies for raising throughput while mitigating accuracy trade-offs in large-scale machine learning training.
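The block-based quantization idea mentioned above can be sketched roughly as follows: instead of one scale factor for an entire gradient tensor, each fixed-size block gets its own scale, so an outlier only hurts precision within its block. This is a minimal NumPy illustration, not DeepSpeed's actual implementation; the function names and block size are assumptions.

```python
import numpy as np

def blockwise_quantize(x, block_size=256):
    """Quantize a 1-D float array to int8 with one scale per block.

    Per-block scaling limits the blast radius of outliers: a large
    value degrades precision only within its own block, not across
    the whole tensor.
    """
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    # Symmetric quantization: scale each block so its max maps to 127.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    scales[scales == 0] = 1.0  # avoid division by zero on all-zero blocks
    q = np.clip(np.round(blocks / scales), -127, 127).astype(np.int8)
    return q, scales, len(x)

def blockwise_dequantize(q, scales, n):
    """Reverse the quantization; n trims the padding added above."""
    return (q.astype(np.float32) * scales).reshape(-1)[:n]

# Simulate a gradient tensor, round-trip it, and inspect the error.
grads = np.random.randn(10_000).astype(np.float32)
q, s, n = blockwise_quantize(grads)
restored = blockwise_dequantize(q, s, n)
print("max round-trip error:", np.abs(grads - restored).max())
```

The payoff in a cluster setting is bandwidth: int8 blocks plus a handful of float scales are roughly a quarter the size of the fp32 gradients they replace, which is the kind of reduction the quantized collectives in ZeRO++ exploit.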
