

Multi-Cluster Orchestrator, with Nick Eberts and Jon Li
May 28, 2025
In this discussion, Nick Eberts, a Product Manager at Google focused on multi-cluster tooling, and Jon Li, a Software Engineer specializing in AI Inference, dive into the fascinating world of the Multi-Cluster Orchestrator (MCO). They explore its pivotal role in managing workloads across multiple Kubernetes clusters, tackling challenges like deployment efficiency and load balancing. The duo shares insights on optimizing resource use, auto-scaling capabilities, and the intricacies of maintaining cluster profiles for enhanced operational efficiency. Tune in for a blend of tech wisdom and innovative strategies!
Kubernetes Cloud Assumptions Shift
- Kubernetes was designed assuming effectively infinite, uniform cloud capacity, an assumption that no longer holds with accelerated hardware like GPUs.
- Multi-region, multi-cluster solutions are needed to handle stockouts and varied hardware capabilities across regions.
Balancing Cluster Size
- Large clusters are possible but not always ideal because of the control plane's blast radius.
- Balancing cluster size against application bin packing improves both reliability and resource use.
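The "application bin packing" mentioned above can be illustrated with the classic first-fit-decreasing heuristic: pack pod resource requests onto the fewest nodes of a fixed capacity. This is a simplified sketch, not the Kubernetes scheduler's actual algorithm, which scores nodes across many dimensions.

```python
def first_fit_decreasing(requests, node_capacity):
    """Greedily pack resource requests onto the fewest nodes.

    Illustrative bin-packing sketch: sort requests largest-first,
    then place each on the first node with enough free capacity,
    opening a new node when none fits.
    """
    free = []  # remaining free capacity per node
    for req in sorted(requests, reverse=True):
        if req > node_capacity:
            raise ValueError(f"request {req} exceeds node capacity")
        for i, space in enumerate(free):
            if req <= space:
                free[i] = space - req  # reuse an existing node
                break
        else:
            free.append(node_capacity - req)  # open a new node
    return len(free)
```

Tighter packing means fewer, fuller nodes, but it also concentrates more workloads behind each failure domain, which is the trade-off against blast radius noted above.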
MCO Manages Scaling Efficiently
- Multi-Cluster Orchestrator (MCO) solves scaling inference workloads up from zero only when needed, saving on expensive GPU costs.
- It recommends which clusters should run a workload based on capacity and preferences, but does not perform the deployment itself.
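The "recommend, don't deploy" pattern described above can be sketched as a ranking over cluster profiles. All field names and the scoring rule here are invented for illustration; they are not MCO's actual API.

```python
from dataclasses import dataclass


@dataclass
class ClusterProfile:
    """Hypothetical cluster profile: name, region, and free accelerators."""
    name: str
    region: str
    free_gpus: int


def recommend_clusters(profiles, gpus_needed, preferred_regions):
    """Return clusters that can fit the workload, best match first.

    The orchestrator only emits this ranked list; a separate delivery
    tool performs the actual deployment to the chosen cluster.
    """
    candidates = [p for p in profiles if p.free_gpus >= gpus_needed]
    # Prefer the caller's regions first, then the most free capacity.
    return sorted(
        candidates,
        key=lambda p: (p.region not in preferred_regions, -p.free_gpus),
    )
```

Separating the recommendation from the rollout is what lets the workload scale from zero across regions: when the preferred region is stocked out, the ranked list simply falls through to the next cluster with capacity.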