Discover how to manage Kubernetes at scale with declarative infrastructure and automation principles.
Zain Malik shares his experience managing multi-tenant Kubernetes clusters running up to 30,000 pods, with clusters capped at 950 nodes. He explains how his team moved from Terraform to Cluster API for declarative cluster lifecycle management, contributing upstream to improve AKS support while implementing GitOps workflows.
You will learn:
How to address challenges in large-scale Kubernetes operations, including node pool management inconsistencies and lengthy provisioning times
Why Cluster API provides a powerful foundation for multi-cloud cluster management, and how to extend it with custom operators for production-specific needs
How implementing GitOps principles eliminates manual intervention in critical operations like cluster upgrades
Strategies for handling production incidents and bugs when adopting emerging technologies like Cluster API
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/5PLksqVlk