KubeFM
Discover all the great things happening in the world of Kubernetes, learn (controversial) opinions from the experts and explore the successes (and failures) of running Kubernetes at scale.
Episodes
Oct 7, 2025 • 24min
How We Integrated Native macOS Workloads with Kubernetes, with Vitalii Horbachov
Vitalii Horbachov explains how Agoda built macOS VZ Kubelet, a custom solution that registers macOS hosts as Kubernetes nodes and spins up macOS VMs using Apple's native virtualization framework. He details their journey from managing 200 Mac minis with bash scripts to a Kubernetes-native approach that handles 20,000 iOS tests at scale.
You will learn:
- How to build hybrid runtime pods that combine macOS VMs with Docker sidecar containers for complex CI/CD workflows
- Custom OCI image format implementation for managing 55-60GB macOS VM images with layered copy-on-write disks and digest validation
- Networking and security challenges including Apple entitlements, direct NIC access, and implementing kubectl exec over SSH
- Real-world adoption considerations including MDM-based host lifecycle management and the build vs. buy decision for Apple infrastructure at scale
Sponsor: This episode is brought to you by Testkube, where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io
More info: Find all the links and info for this episode here: https://ku.bz/q_JS76SvM
Oct 6, 2025 • 45min
The Making of Flux: The Rewrite, a KubeFM Original Series
In this episode, Michael Bridgen (the engineer who wrote Flux's first lines) and Stefan Prodan (the maintainer who led the v2 rewrite) share how Flux grew from a fragile hack-day script into a production-grade GitOps toolkit.
You will learn:
- How early Flux addressed the risks of manual, unsafe Kubernetes upgrades
- Why the complete v2 rewrite was critical for stability, scalability, and adoption
- What the maintainers learned about building a sustainable, community-driven open-source project
Sponsor: Join the Flux maintainers and community at FluxCon, November 11th in Atlanta.
More info: Find all the links and info for this episode here: https://ku.bz/bgkgn227-
Sep 30, 2025 • 48min
Scaling CI horizontally with Buildkite, Kubernetes, and multiple pipelines, with Ben Poland
Ben Poland, a senior staff platform engineer at Faire, dives into the transformation of CI systems from Jenkins to Buildkite. He discusses the challenges of scaling CI, addressing API throttling and optimizing workflows. Ben shares insights on splitting monolithic pipelines into service-scoped ones for better efficiency and how to manage CI across multiple Kubernetes clusters. Performance enhancements like Git mirroring and predictive provisioning are highlighted, leading to impressive results such as reduced failure rates and faster PR processing.
Sep 23, 2025 • 53min
Not Every Problem Needs Kubernetes, with Danyl Novhorodov
Danyl Novhorodov, a veteran .NET engineer and architect at Eneco, presents his controversial thesis that 90% of teams don't actually need Kubernetes. He walks through practical decision-making frameworks, explores powerful alternatives like BEAM runtimes and Actor models, and explains why starting with modular monoliths often beats premature microservices adoption.
You will learn:
- The COST decision framework: how to evaluate infrastructure choices based on Complexity, Ownership, Skills, and Time rather than industry hype
- Platform engineering vs. managed services: how to honestly assess whether your team can compete with AWS, Azure, and Google's managed container platforms
- Evolutionary architecture approach: why modular monoliths with clear boundaries often provide better foundations than distributed systems from day one
Sponsor: This episode is brought to you by Testkube, where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io
More info: Find all the links and info for this episode here: https://ku.bz/BYhFw8RwW
Sep 16, 2025 • 38min
VerticalPodAutoscaler Went Rogue: It Took Down Our Cluster, with Thibault Jamet
Thibault Jamet, Head of Runtime at Adevinta, shares his expertise running a multi-tenant Kubernetes platform. He dives into a chaotic incident where the Vertical Pod Autoscaler led to critical pod evictions. Thibault discusses the architecture of VPA and the debugging process that revealed hidden limits in Kubernetes. He emphasizes the importance of monitoring webhook latency and pod eviction rates to catch issues early. Listeners gain invaluable lessons on scaling challenges and operational strategies for maintaining high-performance systems.
Sep 15, 2025 • 22min
The Making of Flux: The Origin, a KubeFM Original Series
This episode unpacks the technical and governance milestones that secured Flux's place in the cloud-native ecosystem, from a 45-minute production outage that led to the birth of GitOps to the CNCF process that defines project maturity and the handover of stewardship after Weaveworks' closure.
You will learn:
- How a single incident pushed Weaveworks to adopt Git as the source of truth, creating the foundation of GitOps.
- How Flux sustained continuity after Weaveworks shut down through community governance.
- Where Flux is heading next with security guidance, Flux v2, and an enterprise-ready roadmap.
Sponsor: Join the Flux maintainers and community at FluxCon, November 11th in Atlanta.
More info: Find all the links and info for this episode here: https://ku.bz/5Sf5wpd8y
Sep 9, 2025 • 26min
Predictive vs Reactive: A Journey to Smarter Kubernetes Scaling, with Jorrick Stempher
Jorrick Stempher, a junior software engineer and student at Windesheim, discusses his team's innovative predictive scaling system for Kubernetes clusters, leveraging machine learning. They utilize the Prophet model to forecast load patterns, enabling preemptive scaling decisions that improve response times dramatically. Stempher dives into the Node Ranking Index for efficient resource management and shares insights on real-world challenges like data validation and load testing. The conversation highlights practical approaches to optimize Kubernetes scalability in dynamic environments.
Sep 2, 2025 • 35min
Solving Cold Starts: Using Istio to Warm Up Java Pods, with Frédéric Gaudet
If you're running Java applications in Kubernetes, you've likely experienced the pain of slow pod startups affecting user experience during deployments and scaling events.
Frédéric Gaudet, Senior SRE at BlaBlaCar, shares how his team solved the cold start problem for their 1,500 Java microservices using Istio's warm-up capabilities.
You will learn:
- Why Java applications struggle with cold starts and how JIT compilation affects initial request latency in Kubernetes environments
- How Istio's warm-up feature works to gradually ramp up traffic to new pods
- Why other common solutions fail, including resource over-provisioning, init containers, and tools like GraalVM
- Real production impact from implementing this solution, including dramatic improvements in message moderation SLOs at BlaBlaCar's scale of 4,000 pods
Sponsor: This episode is brought to you by Testkube, the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io
More info: Find all the links and info for this episode here: https://ku.bz/grxcypt9j
Aug 26, 2025 • 28min
Teaching Kubernetes to Scale with a MacBook Screen Lock, with Brian Donelan
Brian Donelan, VP of Cloud Platform Engineering at JPMorgan Chase, shares his innovative side project that automates Kubernetes workload scaling based on MacBook screen lock status. He connects macOS notifications to CloudWatch, achieving 80% cost savings by scaling resources to zero when idle. The discussion highlights KEDA's unique event-driven scaling capabilities, creative metrics for different industries, and strategies for optimizing cloud resource usage, making workload management more efficient and sustainable.
Aug 19, 2025 • 41min
Building a Carbon and Price-Aware Kubernetes Scheduler, with Dave Masselink
Data centers consume over 4% of global electricity and this number is projected to triple in the next few years due to AI workloads.
Dave Masselink, founder of Compute Gardener, discusses how he built a Kubernetes scheduler that makes scheduling decisions based on real-time carbon intensity data from power grids.
You will learn:
- How carbon-aware scheduling works: using real-time grid data to shift workloads to periods when electricity generation has lower carbon intensity, without changing energy consumption
- Technical implementation details: building custom Kubernetes schedulers using the scheduler plugin framework, including pre-filter and filter stages for carbon and time-of-use pricing optimization
- Energy measurement strategies: approaches for tracking power consumption across CPUs, memory, and GPUs
Sponsor: This episode is brought to you by Testkube, the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io
More info: Find all the links and info for this episode here: https://ku.bz/zk2xM1lfW


