I'm facing a persistent problem with low node utilization in Google Kubernetes Engine (GKE); it looks like a bin packing issue. My setup is a mix of microservices, some of them with Horizontal Pod Autoscaling (HPA) enabled. Even though I've tuned pod requests and limits reasonably well, I'm still seeing a lot of unused CPU and memory. Node utilization often drops below 40%, even during peak times. I've tried node auto-provisioning, but it creates several node pools and leads to slow pod scheduling. I'm looking for better solutions or suggestions to address this. Any help would be greatly appreciated!
4 Answers
Cast AI might be a good option, depending on your budget. It can help with node provisioning and keep your pod resources aligned more effectively.
This low utilization issue is quite common and is essentially a Tetris problem. Standard node autoscalers such as the GKE cluster autoscaler may not fully address it, since they mainly react to pending pods rather than actively repacking the ones that are already running. You can tackle it with three main strategies: rightsizing your resource requests, defragmentation to rearrange your pods onto fewer nodes, and newer tools for workload-centric optimization. Those tools can adjust resource requests dynamically based on actual usage and identify pods that are blocking scale-downs, which reduces the empty space on your nodes. Make sure to review your Pod Disruption Budgets too: an overly strict PDB can prevent pods from being evicted, which blocks node drains and scale-down.
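To get a rough feel for how much room defragmentation could recover, here is a minimal sketch that repacks pod requests with a first-fit-decreasing heuristic and counts the nodes needed. This is not how the Kubernetes scheduler or any autoscaler works internally; it ignores affinity rules, taints, DaemonSets, and PDBs. The names `Node` and `pack_first_fit_decreasing`, the pod sizes, and the node shape are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical node tracked by its remaining allocatable resources."""
    cpu_free: int  # millicores
    mem_free: int  # MiB
    pods: list = field(default_factory=list)


def pack_first_fit_decreasing(pods, node_cpu, node_mem):
    """Pack (name, cpu_millicores, mem_mib) requests onto identical nodes.

    Pods are sorted by their dominant share of a node (largest first) and
    each one goes onto the first node that still has room for it.
    """
    ordered = sorted(
        pods,
        key=lambda p: max(p[1] / node_cpu, p[2] / node_mem),
        reverse=True,
    )
    nodes = []
    for name, cpu, mem in ordered:
        if cpu > node_cpu or mem > node_mem:
            raise ValueError(f"pod {name} does not fit on an empty node")
        for node in nodes:
            if node.cpu_free >= cpu and node.mem_free >= mem:
                break
        else:
            # No existing node has room, so open a new one.
            node = Node(cpu_free=node_cpu, mem_free=node_mem)
            nodes.append(node)
        node.cpu_free -= cpu
        node.mem_free -= mem
        node.pods.append(name)
    return nodes


if __name__ == "__main__":
    # Illustrative requests only; substitute the values from your own cluster.
    example_pods = [
        ("api", 3000, 4096),
        ("worker", 3000, 4096),
        ("cache", 1000, 2048),
        ("cron", 1000, 2048),
    ]
    packed = pack_first_fit_decreasing(example_pods, node_cpu=4000, node_mem=8192)
    print(f"nodes needed: {len(packed)}")
    for i, node in enumerate(packed):
        print(i, node.pods, f"cpu_free={node.cpu_free}m mem_free={node.mem_free}Mi")
```

If the node count this gives is well below what you are actually running, fragmentation (or scheduling constraints the sketch ignores) is likely a big part of the gap; if it is close, the requests themselves are probably oversized.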
Have you thought about using Karpenter? It only applies if you're on GKE Standard rather than Autopilot, since Autopilot manages the nodes for you. On Standard it should work well for taking control of node autoscaling yourself, which allows for more efficient resource use.
If you haven't tried it yet, give KRR (Robusta's Kubernetes Resource Recommender) a shot! It recommends CPU and memory requests based on observed usage, which might help with the overall utilization problem.
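For a sense of what a usage-based recommendation looks like, here is a minimal sketch of a generic percentile heuristic. It is not KRR's actual algorithm or API: the function names, the 95th-percentile CPU target, and the 15% memory headroom are assumptions chosen for illustration, and the samples would come from your own metrics.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_requests(cpu_millicores, mem_mib, cpu_pct=95, mem_headroom=0.15):
    """Suggest requests from usage samples (hypothetical heuristic).

    CPU: the chosen percentile of observed usage, since CPU is compressible.
    Memory: the observed peak plus headroom, since running out means an OOM kill.
    """
    cpu = percentile(cpu_millicores, cpu_pct)
    mem = math.ceil(max(mem_mib) * (1 + mem_headroom))
    return {"cpu": f"{cpu}m", "memory": f"{mem}Mi"}


if __name__ == "__main__":
    # Made-up samples for one container.
    cpu_usage = [100, 120, 110, 400, 130, 115, 125, 105, 118, 122]
    mem_usage = [300, 310, 512, 305]
    print(recommend_requests(cpu_usage, mem_usage))
```

Comparing numbers like these against your current requests shows which workloads are reserving far more than they use, and those are the ones leaving holes on your nodes.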

Yeah, I've heard Karpenter is a solid choice for managing node scaling! It's reportedly still in alpha for GKE, but it's definitely worth checking out.