Configuring requests & limits with the HPA at scale, with Alexandre Souza
Sep 24, 2024
Alexandre Souza, a senior platform engineer at Getir, dives into the art of managing large-scale Kubernetes environments. He uncovers the pitfalls of over- and under-provisioning while detailing strategies for optimizing resource requests and limits. Expect insights on configuring the Horizontal Pod Autoscaler (HPA) effectively, and on the importance of balancing CPU and memory for better performance. Souza also discusses automation tools like Kubecost and StormForge, alongside tips for fostering team buy-in for resource management practices.
AI Snips
Getir's Cluster Scale And Structure
- Getir's dev cluster hosts about 146 namespaces and 2,200+ workloads with little governance.
- Production has ~46 namespaces and ~797 workloads, peaking at 8–9k pods during spikes.
Set Conservative Requests From Observed Usage
- Use conservative resource requests based on monitored usage rather than arbitrary high values.
- Monitor actual CPU and memory to avoid overpaying and reduce overprovisioning.
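One way to turn monitored usage into a request is to take a high percentile of the observed samples and add a little headroom. The sketch below is only an illustration of that idea, not the method described in the episode: the function name `recommend_request`, the 95th-percentile default, the 15% headroom, and the sample values are all assumptions chosen for the example.

```python
import math


def recommend_request(samples, percentile=0.95, headroom_pct=15):
    """Suggest a resource request from observed usage samples.

    Hypothetical helper: takes the nearest-rank percentile of the samples
    and adds integer-percent headroom. Units are whatever the samples use
    (for example CPU millicores or memory MiB).
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: smallest value with at least
    # `percentile` of the samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    base = ordered[rank - 1]
    # Integer ceiling division keeps the result exact.
    return -(-base * (100 + headroom_pct) // 100)


# Made-up CPU usage samples in millicores, for illustration only.
cpu_millicores = [120, 135, 150, 160, 180, 210, 240, 260, 300, 420]
print(recommend_request(cpu_millicores))  # prints 483
```

A lower percentile or smaller headroom gives a tighter request; the right values depend on how spiky the workload is and how much risk of throttling or eviction is acceptable.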
Underestimating Requests Is Riskier At Scale
- Underestimating requests risks evictions, performance degradation, and contention in large, dynamic clusters.
- Small static clusters tolerate lower requests more easily than large multi-tenant clusters.
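To make the eviction risk concrete: under node memory pressure, the kubelet considers pods whose usage exceeds their requests before pods that stay within them. The sketch below flags such pods from a list of usage figures. It is a deliberate simplification that ignores pod priority, and the `PodMemory` type, the `over_request` function, and the pod names and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PodMemory:
    name: str
    request_mib: int  # memory request configured on the pod
    usage_mib: int    # memory usage observed by monitoring


def over_request(pods):
    """Return pods using more memory than they requested, largest overage first."""
    over = [p for p in pods if p.usage_mib > p.request_mib]
    return sorted(over, key=lambda p: p.usage_mib - p.request_mib, reverse=True)


# Made-up figures for illustration only.
pods = [
    PodMemory("checkout", request_mib=256, usage_mib=410),
    PodMemory("search", request_mib=512, usage_mib=480),
    PodMemory("catalog", request_mib=128, usage_mib=300),
]
for pod in over_request(pods):
    print(pod.name, pod.usage_mib - pod.request_mib)
```

In a small static cluster such a list tends to stay short and stable; in a large multi-tenant cluster it changes constantly, which is why under-set requests are harder to tolerate there.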
