
Networking Optimizations for Multi-Node Deep Learning on Kubernetes with Erez Cohen - #345
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
00:00
Integrating Frameworks for Distributed Deep Learning
This chapter explores integrating TensorFlow with libraries such as Horovod and NVIDIA's NCCL for distributed deep learning training. It covers the configurations needed to manage workloads across GPUs and servers, highlights technologies like RDMA and GPUDirect, and emphasizes open-source compatibility with other frameworks such as PyTorch.
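For context, the TensorFlow–Horovod integration discussed in the chapter is typically set up in user code along these lines. This is a minimal sketch assuming Horovod's TensorFlow 2 API; the model, optimizer, and learning-rate scaling shown are illustrative and not taken from the episode.

```python
# Minimal Horovod + TensorFlow 2 sketch: one process per GPU, gradients
# allreduced across workers (over NCCL, and RDMA/GPUDirect where available).
import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()

# Pin each worker process to its own local GPU.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.set_visible_devices(gpus[hvd.local_rank()], "GPU")

# Placeholder model and optimizer; scale the learning rate by worker count.
model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
opt = tf.optimizers.SGD(0.01 * hvd.size())

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    # Wrap the tape so gradients are averaged across all workers.
    tape = hvd.DistributedGradientTape(tape)
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

In practice the initial model weights are also broadcast from rank 0 to the other workers before training begins, so every process starts from the same state.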