The Everything Feed - All Packet Pushers Pods

HN782: Netris Meets Your Network Automation Challenges in AI Data Centers (Sponsored)

May 23, 2025
Alex Saroyan, CEO and co-founder of Netris, discusses network automation in AI data centers. He covers the complexities of multi-tenancy, particularly optimizing resource management for costly GPUs. The conversation weighs the advantages of InfiniBand against Ethernet and examines NVIDIA's adaptive routing for improved GPU performance. Saroyan also explores the role of SmartNICs in network management and Netris's approach to automating cloud environments for efficient, scalable networks.
INSIGHT

AI Data Center Infrastructure Needs

  • AI data centers need reinvented infrastructure to meet GPU demand, including power, cooling, compute, and networking.
  • GPU clustering requires extremely low-latency, high-throughput networking to keep GPUs from sitting idle and maximize compute utilization.
INSIGHT

Networking Solves GPU Memory Expansion

  • GPUs operate only on data in their own memory, so networking solutions must effectively connect multiple GPUs' memories.
  • This memory expansion across GPUs is a core networking challenge in AI data centers.
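To make the "memory expansion" idea concrete, here is a minimal sketch of ring all-reduce, the collective pattern commonly used to combine gradients that live in separate GPUs' memories. This is a plain-Python simulation (lists stand in for GPU buffers; no real GPUs, NCCL, or fabric involved), intended only to show why results scattered across devices must move over the network.

```python
def ring_allreduce(buffers):
    """Simulate ring all-reduce: one list per 'GPU'; after the call,
    every buffer holds the element-wise sum of all buffers."""
    n = len(buffers)
    chunk = len(buffers[0]) // n  # assume vector length divisible by n

    # Phase 1: reduce-scatter. After n-1 steps, GPU i holds the fully
    # summed chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing messages first to model simultaneous sends.
        msgs = []
        for i in range(n):
            c = (i - step) % n  # chunk GPU i forwards this step
            msgs.append(((i + 1) % n, c, buffers[i][c * chunk:(c + 1) * chunk]))
        for dst, c, data in msgs:
            for k, v in enumerate(data):
                buffers[dst][c * chunk + k] += v  # accumulate on receive

    # Phase 2: all-gather. Each GPU forwards its completed chunk around
    # the ring; receivers overwrite instead of accumulate.
    for step in range(n - 1):
        msgs = []
        for i in range(n):
            c = (i + 1 - step) % n
            msgs.append(((i + 1) % n, c, buffers[i][c * chunk:(c + 1) * chunk]))
        for dst, c, data in msgs:
            buffers[dst][c * chunk:(c + 1) * chunk] = data
    return buffers

# Three simulated GPUs, each holding a 6-element gradient:
result = ring_allreduce([[1] * 6, [2] * 6, [3] * 6])
# every GPU now holds the sum [6, 6, 6, 6, 6, 6]
```

Every element crosses the fabric roughly twice regardless of cluster size, which is why per-GPU link bandwidth and latency dominate collective performance.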
INSIGHT

Dual Fabrics in AI Data Centers

  • AI data centers have dual network fabrics: front-end for production/storage access, back-end optimized for GPU-to-GPU communication.
  • The back-end fabric dedicates a 400 Gbps NIC per GPU to enable extreme east-west traffic with minimal contention.
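A quick back-of-envelope calculation shows why one 400 Gbps NIC per GPU produces enormous east-west demand. The server and cluster sizes below are hypothetical examples (the common 8-GPU server design and an assumed 128-server cluster), not figures from the episode.

```python
# Back-end fabric sizing sketch under assumed cluster dimensions.
GPUS_PER_SERVER = 8    # typical 8-GPU AI server (assumption)
NIC_GBPS = 400         # one 400 Gbps NIC dedicated per GPU
SERVERS = 128          # hypothetical cluster size

per_server_gbps = GPUS_PER_SERVER * NIC_GBPS       # 3200 Gbps leaving each server
cluster_gpus = SERVERS * GPUS_PER_SERVER           # 1024 GPUs total
aggregate_tbps = cluster_gpus * NIC_GBPS / 1000    # 409.6 Tbps of east-west capacity

print(per_server_gbps, cluster_gpus, aggregate_tbps)
```

Even this modest example yields 3.2 Tbps per server, which is why the back-end fabric is built separately from the front-end production/storage network.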