Predictive vs Reactive: A Journey to Smarter Kubernetes Scaling, with Jorrick Stempher

Sep 9, 2025

Jorrick Stempher, a junior software engineer and student at Windesheim, discusses the predictive scaling system his team built for Kubernetes clusters using machine learning. The team uses the Prophet model to forecast load patterns, so scaling decisions can be made before demand arrives, which improves response times. Stempher also covers the Node Ranking Index for efficient resource management and shares real-world challenges such as data validation and load testing. The conversation highlights practical approaches to optimizing Kubernetes scalability in dynamic environments.

INSIGHT

Predictive Scaling Requires Lead Time

Reactive scaling adds capacity only after nodes are already overloaded; predictive scaling adds it before the load arrives.
To be effective, the forecast must look far enough ahead to cover node startup and cluster-join time.
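
The timing constraint can be illustrated with a minimal sketch. It assumes the forecast is available as a list of (minutes from now, predicted load) pairs; the function name `latest_provision_start`, the load units, and all numbers are hypothetical and not taken from the episode.

```python
from typing import Optional, Sequence, Tuple


def latest_provision_start(
    forecast: Sequence[Tuple[float, float]],
    capacity: float,
    lead_time_min: float,
) -> Optional[float]:
    """Return how many minutes from now provisioning must begin, or None.

    forecast: (minutes_from_now, predicted_load) pairs, sorted by time.
    capacity: load the current nodes can serve (hypothetical unit).
    lead_time_min: time a new node needs before it can take traffic.
    """
    for minutes_ahead, predicted_load in forecast:
        if predicted_load > capacity:
            # A negative result means the predicted overload is closer than
            # the lead time, so a new node cannot be ready in time.
            return minutes_ahead - lead_time_min
    # No predicted overload within the forecast window.
    return None


# Hypothetical forecast: overload is first predicted 15 minutes from now.
example_forecast = [(5.0, 80.0), (10.0, 95.0), (15.0, 120.0)]
start_in = latest_provision_start(example_forecast, capacity=100.0, lead_time_min=9.0)
```

With a 9-minute lead time, provisioning has to start within 6 minutes. If the lead time exceeded 15 minutes, the result would be negative, which is the case a too-short forecast horizon produces.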

ANECDOTE

One Frontend Per vCPU

Testing showed Next.js frontends performed poorly when sharing vCPUs, so the team used a 1:1 vCPU-to-instance rule.
That finding became a default scaling rule for node sizing decisions.
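
As a rough sketch of how a one-frontend-per-vCPU rule could feed a node-count calculation: the function `nodes_needed` and the `reserved_vcpus` parameter are illustrative assumptions, not the team's actual implementation.

```python
import math


def nodes_needed(frontend_replicas: int, vcpus_per_node: int, reserved_vcpus: int = 0) -> int:
    """Nodes required when every frontend instance gets its own vCPU.

    reserved_vcpus is a hypothetical allowance for system workloads
    that should not share a vCPU with a frontend instance.
    """
    usable_vcpus = vcpus_per_node - reserved_vcpus
    if usable_vcpus <= 0:
        raise ValueError("no vCPUs left for frontend instances")
    # One instance per usable vCPU, rounded up to whole nodes.
    return math.ceil(frontend_replicas / usable_vcpus)


# Hypothetical numbers: 10 frontend instances on 4-vCPU nodes.
example_nodes = nodes_needed(10, 4)
```

Here 10 instances on 4-vCPU nodes need 3 nodes; reserving one vCPU per node raises that to 4.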

INSIGHT

Account For 8–9 Minute Boot Time

Node startup time dominated the required prediction horizon: the team's first-boot time averaged roughly 8–9 minutes.
Prediction windows must include provisioning, boot, and cluster-join time to meet demand on time.
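
A small sketch of turning that lead time into a minimum forecast horizon, expressed in forecast steps. The split into three phases, the 5-minute step size, and the name `required_horizon_steps` are assumptions for illustration; the episode only gives the overall ~8–9 minute figure.

```python
import math
from typing import Sequence


def required_horizon_steps(
    phase_durations_min: Sequence[float],
    step_min: float,
    safety_margin_min: float = 0.0,
) -> int:
    """Minimum number of forecast steps needed to cover the node lead time.

    phase_durations_min: durations of provisioning, boot, and cluster join.
    step_min: spacing between forecast points, in minutes.
    safety_margin_min: optional extra buffer (hypothetical).
    """
    if step_min <= 0:
        raise ValueError("step_min must be positive")
    lead_time_min = sum(phase_durations_min) + safety_margin_min
    # Round up: a partial step still needs a full forecast point.
    return math.ceil(lead_time_min / step_min)


# Hypothetical split of a 9-minute lead time: provisioning, boot, join.
example_steps = required_horizon_steps([2.0, 5.0, 2.0], step_min=5.0)
```

With a 9-minute lead time and 5-minute forecast steps, the forecast has to extend at least 2 steps (10 minutes) ahead.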
