I'm dealing with a persistent issue in Google Kubernetes Engine: low node utilization, often under 40% even at peak. We run a standard GKE setup with a mix of microservices and the Horizontal Pod Autoscaler (HPA), and while our pod requests and limits are reasonably tuned, we still see a lot of unused CPU and memory. We've tried node auto-provisioning, but it ends up creating lots of node pools and slows down pod scheduling. I'm looking for better ways to tackle this bin-packing problem. Any advice would be greatly appreciated!
4 Answers
Have you considered Karpenter? It's a solid tool for node autoscaling and tends to pack workloads onto nodes more efficiently. Two caveats: it doesn't apply on GKE Autopilot (Google manages the nodes for you there), and Karpenter originated on AWS, so check the maturity of its GCP support before committing.
This is a classic Tetris problem! Autoscalers like Cluster Autoscaler and Karpenter mostly react to pending pods; they don't repack pods that are already running. To really improve utilization, you need to work on three fronts: rightsizing your pod requests so they match actual usage, defragmenting workloads (consolidating pods onto fewer nodes so the empty ones can be drained and removed), and possibly a workload-centric optimization approach. Tools like KRR can report on over-provisioned requests, but they won't fix the fragmentation automatically. For that you may need something that actively adjusts requests based on live usage, like ScaleOps.
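To make the Tetris analogy concrete, here's a small sketch (all numbers hypothetical) of first-fit-decreasing bin packing: sorting pod CPU requests and placing each one on the first node with room shows how many nodes a given set of requests actually needs, versus what a fragmented layout consumes.

```python
# Hypothetical illustration: how deliberate placement affects node count.
# First-fit-decreasing bin packing of pod CPU requests onto 4-core nodes.

NODE_CPU = 4.0  # allocatable cores per node (made-up value)
pod_requests = [2.5, 2.5, 1.5, 1.5, 1.0, 1.0, 0.5, 0.5]  # cores, made up

def first_fit_decreasing(requests, capacity):
    """Pack requests onto as few bins (nodes) as possible, greedily."""
    bins = []
    for r in sorted(requests, reverse=True):
        for b in bins:
            if sum(b) + r <= capacity:
                b.append(r)
                break
        else:
            bins.append([r])  # no existing node fits; provision a new one
    return bins

bins = first_fit_decreasing(pod_requests, NODE_CPU)
util = sum(pod_requests) / (len(bins) * NODE_CPU)
print(f"{len(bins)} nodes, {util:.0%} average utilization")
# Packed tightly, these 11 cores of requests fit on 3 nodes (~92% utilized);
# spread evenly across, say, 5 nodes, utilization drops to 55%.
```

Real schedulers juggle memory, affinity, and disruption budgets too, so this is only the intuition, but it shows why consolidation (repacking running pods) matters as much as reactive scale-up.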
Good point! If you can get your requests to match actual usage, that alone solves a lot of the problem.
If you haven't checked it out yet, give KRR a try. It scans your historical usage metrics via Prometheus and recommends CPU/memory request values, which is a good first step toward rightsizing.
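The idea behind this kind of rightsizing is essentially percentile math over historical usage. Here's a rough sketch of the concept (hypothetical function and sample data, not KRR's actual implementation, which reads from Prometheus):

```python
# Sketch of percentile-based CPU request rightsizing. The function name,
# the 95th-percentile choice, and the sample data are all illustrative.
import statistics

def recommend_cpu_request(usage_samples, percentile=95):
    """Suggest a CPU request covering the given percentile of observed usage."""
    # quantiles() with n=100 returns the 99 percentile cut points.
    cuts = statistics.quantiles(usage_samples, n=100, method="inclusive")
    return cuts[percentile - 1]

# Hypothetical per-minute CPU usage samples (cores) for one container:
samples = [0.1, 0.12, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.8, 1.2]
print(f"recommended CPU request: {recommend_cpu_request(samples):.2f} cores")
```

Setting the request near a high percentile of real usage (rather than a guess made at deploy time) is what lets the scheduler pack nodes tightly without starving the workload; memory is usually sized more conservatively (closer to observed max plus a buffer) since it isn't compressible.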
You might want to look into Cast AI, depending on your budget. It automates autoscaling and resource-allocation decisions for you.

Yeah, Karpenter is pretty great for node management, but keep an eye on how it behaves, especially if you're running multiple node pools.