

Building the AI Hyperscaler with Kubernetes
Jun 28, 2024
Brandon Jacobs, infrastructure architect at CoreWeave, discusses how CoreWeave uses Kubernetes to build an AI hyperscaler. The episode covers managing Day 0 and Day 2 operations for AI labs, lessons learned, and best practices for a Kubernetes-based cloud. Topics include leveraging bare metal Kubernetes for GPU workloads, storage options for AI labs, observability, monitoring, handling CVEs, and customer cluster support.
Chapters
Intro
00:00 • 6min
Implementing Kubernetes at CoreWeave Cloud
05:35 • 9min
Leveraging Bare Metal Kubernetes for GPU Workloads
14:29 • 23min
Storage Options for AI Labs and Kubernetes Challenges
37:23 • 6min
Importance of Observability, Monitoring, and Handling CVEs in a Kubernetes Environment
43:49 • 3min
Customer Cluster Support and Expansion
46:26 • 4min
Exploring CoreWeave's Kubernetes Journey and Future Plans
50:35 • 4min