I've been trying to set up Kubernetes' Horizontal Pod Autoscaler (HPA) using custom metrics from Prometheus, specifically focusing on HTTP latency (p95). However, I'm running into major issues with the scaling behavior—it seems too reactive, spiking up after my latency exceeds the SLO and then crashing down too quickly when things settle.
Here's what I've got:
- Metrics coming from a Prometheus Adapter using a PromQL query based on histogramquantile(0.95, ).
- My HPA is set to adjust replica counts between 3 to 15 based on these latency metrics.
I've attempted tweaking the --horizontal-pod-autoscaler-sync-period and cool-down windows, but they don't seem to help much. Is my approach to HPA wrong when it comes to these custom latency metrics? Would it make more sense to manage this using a service mesh like Envoy or Linkerd? I'd really appreciate any insights on how others have handled this without ditching HPA for options like KEDA or external event-driven scalers.
3 Answers
KEDA can simplify a lot of the scaling issues! However, if you're looking for fine-tuning, consider using stabilizationwindowseconds with your HPA. Setting a longer period for stabilization can help smooth out those 'thrashing' scales when traffic spikes. I suggest a stabilization window that’s 3-5 times longer than the evaluation period. This should help with your issue of flapping!
If you're looking at custom metrics like latency, KEDA may actually help you out. It doesn’t abandon HPA; instead, it enhances it by providing better input to the metrics server API. I highly recommend giving KEDA a try!
Instead of tweaking command line flags, make sure you check the settings directly on the HPA. You can adjust the scaling behavior right there in the HPA settings to control how quickly it should scale up or down. This might lead to fewer issues than trying to configure things externally. KEDA can definitely help, but it may just be a matter of bad configuration on the HPA.
Related Questions
How To: Running Codex CLI on Windows with Azure OpenAI
Set Wordpress Featured Image Using Javascript
How To Fix PHP Random Being The Same
Why no WebP Support with Wordpress
Replace Wordpress Cron With Linux Cron
Customize Yoast Canonical URL Programmatically