The Node (and more) Banter
Many teams believe autoscaling is simple: set a CPU threshold and let Kubernetes handle it. But if you get a three-minute traffic spike and your pods need two minutes to start, users feel the lag, the spike ends, and your new pods show up too late. What if your infrastructure could predict traffic instead of just reacting?

In this episode, Luca Maraschi and Matteo Collina challenge the usual autoscaling approach: HPA, KEDA, ECS, and more. They explain the predictive scaling algorithm Platformatic created for their Intelligent Command Center (ICC). Matteo explains why scaling is a nonlinear problem that the industry keeps trying to solve with linear solutions, and how thinking of a distributed system like a neural network can change how scaling decisions are made.

In this episode, we cover:
✅ Why reactive autoscalers always use outdated data by design, and why this is a core flaw, not just a configuration issue
✅ The real cost of pod boot time that often gets ignored: spawn time, warmup, and traffic rebalancing
✅ Why CPU and memory are not the right metrics for Node.js, and what you should measure instead
✅ How Platformatic's algorithm checks event loop utilization from outside the thread, with no interference and no extra overhead
✅ The benchmark results: 99.47% success rate compared to 95% with KEDA and 90% with HPA, with P99 latency ten times better
✅ The 46-page white paper they published, and why they believe it's time to stop scaling out of fear

The takeaways? Over-provisioning is not a safety net; it shows the model is broken. If your system cannot predict load, you will always have to choose between wasting money and getting worse performance. This episode, along with the white paper, argues that smarter scaling is now essential, not just a bonus.
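To make the three-minute spike and two-minute boot time concrete, here is a minimal TypeScript sketch of the arithmetic. It is not Platformatic's algorithm. The function name `reactiveCoverage`, the `SpikeCoverage` shape, and the 60-second detection delay are all assumptions for illustration; only the 180-second spike and 120-second boot time come from the description above.

```typescript
// Hypothetical helper: how much of a traffic spike do reactively
// scaled pods actually serve? All names here are illustrative.
interface SpikeCoverage {
  readySeconds: number;     // time from spike start until new pods serve traffic
  coveredSeconds: number;   // part of the spike the new pods help with
  uncoveredSeconds: number; // part of the spike served without them
}

function reactiveCoverage(
  spikeSeconds: number,
  bootSeconds: number,
  detectionSeconds: number, // assumed delay before the autoscaler reacts
): SpikeCoverage {
  // New capacity is usable only after detection plus boot.
  const readySeconds = detectionSeconds + bootSeconds;
  // Capacity that arrives after the spike has ended covers nothing.
  const coveredSeconds = Math.max(0, spikeSeconds - readySeconds);
  return {
    readySeconds,
    coveredSeconds,
    uncoveredSeconds: spikeSeconds - coveredSeconds,
  };
}

// Three-minute spike, two-minute boot, assumed one-minute detection delay.
const late = reactiveCoverage(180, 120, 60);
console.log(late.coveredSeconds); // 0: pods are ready just as the spike ends

// Even with instant detection, only the last minute of the spike is covered.
const instant = reactiveCoverage(180, 120, 0);
console.log(instant.coveredSeconds); // 60
```

Under these assumptions, a reactive autoscaler can help with at most the final third of the spike, which is the gap that predicting load ahead of time is meant to close.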
61 episodes