KubeFM

Managing 100s of Kubernetes Clusters using Cluster API, with Zain Malik

May 20, 2025
33:13

Discover how to manage Kubernetes at scale with declarative infrastructure and automation principles.

Zain Malik shares his experience managing multi-tenant Kubernetes clusters with up to 30,000 pods across clusters capped at 950 nodes. He explains how his team transitioned from Terraform to Cluster API for declarative cluster lifecycle management, contributing upstream to improve AKS support while implementing GitOps workflows.

You will learn:

  • How to address challenges in large-scale Kubernetes operations, including node pool management inconsistencies and lengthy provisioning times

  • Why Cluster API provides a powerful foundation for multi-cloud cluster management, and how to extend it with custom operators for production-specific needs

  • How implementing GitOps principles eliminates manual intervention in critical operations like cluster upgrades

  • Strategies for handling production incidents and bugs when adopting emerging technologies like Cluster API

Sponsor

This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
