

Multi-Cluster Orchestrator, with Nick Eberts and Jon Li
May 28, 2025
In this discussion, Nick Eberts, a Product Manager at Google focused on multi-cluster tooling, and Jon Li, a Software Engineer specializing in AI Inference, dive into the fascinating world of the Multi-Cluster Orchestrator (MCO). They explore its pivotal role in managing workloads across multiple Kubernetes clusters, tackling challenges like deployment efficiency and load balancing. The duo shares insights on optimizing resource use, auto-scaling capabilities, and the intricacies of maintaining cluster profiles for enhanced operational efficiency. Tune in for a blend of tech wisdom and innovative strategies!
Kubernetes Cloud Assumptions Shift
- Kubernetes was designed assuming effectively infinite, uniform cloud capacity, an assumption that no longer holds with accelerated hardware like GPUs.
- Multi-region, multi-cluster solutions are needed to handle stockouts and varied hardware capabilities across regions.
Balancing Cluster Size
- Large clusters are possible but not always ideal because of the control plane's blast radius.
- Balancing cluster size against application bin packing improves both reliability and resource use.
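The "application bin packing" mentioned above can be illustrated with the classic first-fit-decreasing heuristic: pack pod resource requests onto the fewest nodes of a fixed capacity. This is a simplified sketch, not the Kubernetes scheduler's actual algorithm, which scores nodes across many dimensions.

```python
def first_fit_decreasing(requests, node_capacity):
    """Greedily pack resource requests onto the fewest nodes.

    Illustrative bin-packing sketch: sort requests largest-first,
    then place each on the first node with enough free capacity,
    opening a new node when none fits.
    """
    free = []  # remaining free capacity per node
    for req in sorted(requests, reverse=True):
        if req > node_capacity:
            raise ValueError(f"request {req} exceeds node capacity")
        for i, space in enumerate(free):
            if req <= space:
                free[i] = space - req  # reuse an existing node
                break
        else:
            free.append(node_capacity - req)  # open a new node
    return len(free)
```

Tighter packing means fewer, fuller nodes, but it also concentrates more workloads behind each failure domain, which is the trade-off against blast radius noted above.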
MCO Manages Scaling Efficiently
- Multi-Cluster Orchestrator (MCO) solves scaling inference workloads up from zero only when needed, saving on expensive GPU costs.
- It recommends which clusters should run a workload based on capacity and preferences, but does not perform the deployment itself.
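The "recommend, don't deploy" pattern described above can be sketched as a ranking over cluster profiles. All field names and the scoring rule here are invented for illustration; they are not MCO's actual API.

```python
from dataclasses import dataclass


@dataclass
class ClusterProfile:
    """Hypothetical cluster profile: name, region, and free accelerators."""
    name: str
    region: str
    free_gpus: int


def recommend_clusters(profiles, gpus_needed, preferred_regions):
    """Return clusters that can fit the workload, best match first.

    The orchestrator only emits this ranked list; a separate delivery
    tool performs the actual deployment to the chosen cluster.
    """
    candidates = [p for p in profiles if p.free_gpus >= gpus_needed]
    # Prefer the caller's regions first, then the most free capacity.
    return sorted(
        candidates,
        key=lambda p: (p.region not in preferred_regions, -p.free_gpus),
    )
```

Separating the recommendation from the rollout is what lets the workload scale from zero across regions: when the preferred region is stocked out, the ranked list simply falls through to the next cluster with capacity.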