KubeFM

Predictive vs Reactive: A Journey to Smarter Kubernetes Scaling, with Jorrick Stempher

Sep 9, 2025
Jorrick Stempher, a junior software engineer and student at Windesheim, discusses the predictive scaling system his team built for Kubernetes clusters using machine learning. The team uses the Prophet model to forecast load patterns, so scaling decisions can be made before demand arrives and response times improve. Stempher also explains the Node Ranking Index for resource management and shares lessons from real-world challenges such as data validation and load testing. The conversation focuses on practical ways to scale Kubernetes in dynamic environments.
INSIGHT

Predictive Scaling Requires Lead Time

  • Predictive scaling avoids adding capacity only after nodes are already overloaded.
  • You must forecast far enough ahead to cover node startup and join time to be effective.
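A minimal sketch of the lead-time idea above, not the team's actual implementation. The function names, the `(timestamp, load)` forecast format, and the capacity numbers are all hypothetical. It compares the load predicted for the moment a new node could be ready against current capacity, instead of looking at the load right now.

```python
from datetime import datetime, timedelta

def load_when_node_ready(forecast, now, lead_time):
    """Return the predicted load at the first forecast point at or after
    now + lead_time, or None if the forecast does not reach that far."""
    ready_at = now + lead_time
    for ts, load in forecast:  # forecast is assumed sorted by timestamp
        if ts >= ready_at:
            return load
    return None

def should_scale_now(forecast, now, lead_time, capacity):
    """Decide whether to start provisioning now (hypothetical rule)."""
    load = load_when_node_ready(forecast, now, lead_time)
    if load is None:
        raise ValueError("forecast horizon is shorter than the node lead time")
    return load > capacity

# Hypothetical forecast: one point every 5 minutes, load rising by 20 per step.
now = datetime(2025, 9, 9, 12, 0)
forecast = [(now + timedelta(minutes=5 * i), 100 + 20 * i) for i in range(1, 5)]

# With a 9-minute lead time, the relevant point is 12:10 (predicted load 140).
print(should_scale_now(forecast, now, timedelta(minutes=9), capacity=130))  # True
```

A purely reactive check would see the current load of about 100 against a capacity of 130 and do nothing, by which time the new node would arrive too late.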
ANECDOTE

One Frontend Per vCPU

  • Testing showed Next.js frontends performed poorly when sharing vCPUs, so the team used a 1:1 vCPU-to-instance rule.
  • That finding became a default scaling rule for node sizing decisions.
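The 1:1 rule above turns node sizing into simple arithmetic. The sketch below is illustrative only: `nodes_needed` and the `reserved_vcpus` parameter (vCPUs set aside for system components) are assumptions, not something described in the episode.

```python
import math

def nodes_needed(frontend_instances, vcpus_per_node, reserved_vcpus=0):
    """Number of nodes required when each frontend instance gets one
    dedicated vCPU. reserved_vcpus is a hypothetical per-node allowance
    for system workloads."""
    usable = vcpus_per_node - reserved_vcpus
    if usable <= 0:
        raise ValueError("no usable vCPUs left on the node")
    return math.ceil(frontend_instances / usable)

print(nodes_needed(10, vcpus_per_node=4))                    # 3
print(nodes_needed(10, vcpus_per_node=4, reserved_vcpus=1))  # 4
```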
INSIGHT

Account For 8–9 Minute Boot Time

  • Node startup time dominated the prediction horizon; their first-boot time averaged ~8–9 minutes.
  • Prediction windows must include provisioning, boot, and cluster-join time to meet demand on time.
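One way to size the prediction window from these components is sketched below. The split of the roughly 9 minutes into provisioning, boot, and join, the safety margin, and the 5-minute forecast step are all hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class NodeStartup:
    """Per-phase startup times in seconds (hypothetical split of ~9 minutes)."""
    provision_s: int = 120
    boot_s: int = 360
    join_s: int = 60

    def total_s(self) -> int:
        return self.provision_s + self.boot_s + self.join_s

def horizon_steps(startup: NodeStartup, step_s: int, margin_s: int = 0) -> int:
    """Minimum number of forecast steps needed to cover node startup
    plus an optional safety margin."""
    return math.ceil((startup.total_s() + margin_s) / step_s)

startup = NodeStartup()
print(startup.total_s())                                   # 540 seconds, i.e. 9 minutes
print(horizon_steps(startup, step_s=300, margin_s=60))     # 2 steps of 5 minutes
```

If the forecast step is coarser than the startup time, a single step already covers it. With finer steps, the model has to predict correspondingly more points ahead.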