How can we handle unevictable pods hogging node resources?

Asked By CuriousLynx93 On

I'm looking for advice from a FinOps perspective regarding our Kubernetes cluster. I've noticed that many of our nodes are only utilizing about 20-30% of their capacity, which seems like a good opportunity to consolidate and lower our node count. However, the DevOps team tells me that some pods are effectively unevictable, which prevents us from draining those nodes. The reasons behind this include pod disruption budgets, local storage requirements, strict affinities, and sometimes just a lack of alternative nodes that can host these pods. So while it seems like we have idle nodes, they're actually kept alive by one or two pods.

I understand the hesitation from the DevOps side, but it's frustrating from a financial perspective to see our capacity committed to these underutilized nodes. What strategies do teams usually implement to address this issue? How can I propose a solution to the DevOps team without coming off as overly simplistic, like merely suggesting they move the pods?
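To make the conversation with DevOps concrete, it can help to list why each pod blocks a drain. Below is a minimal Python sketch that works on plain dicts shaped like the items in `kubectl get pods -o json`. The function name `drain_blockers` and the `pdb_disruptions_allowed` parameter are made up for illustration; the caller is assumed to look up the matching PDB's `status.disruptionsAllowed` separately. The checks are heuristics, not an exact copy of what any autoscaler does.

```python
from typing import Optional


def drain_blockers(pod: dict, pdb_disruptions_allowed: Optional[int] = None) -> list:
    """Return human-readable reasons this pod may keep its node from draining."""
    reasons = []
    annotations = pod.get("metadata", {}).get("annotations") or {}
    spec = pod.get("spec", {})

    # Annotations that tell Cluster Autoscaler / Karpenter to leave the pod alone.
    if annotations.get("cluster-autoscaler.kubernetes.io/safe-to-evict") == "false":
        reasons.append("annotated safe-to-evict=false")
    if annotations.get("karpenter.sh/do-not-disrupt") == "true":
        reasons.append("annotated do-not-disrupt")

    # Node-local volumes: the data does not follow the pod to another node.
    for volume in spec.get("volumes") or []:
        if "emptyDir" in volume or "hostPath" in volume:
            reasons.append(f"local storage volume '{volume.get('name')}'")

    # Required node affinity does not block eviction itself,
    # but it limits where the pod can land afterwards.
    node_affinity = (spec.get("affinity") or {}).get("nodeAffinity") or {}
    if node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution"):
        reasons.append("required node affinity restricts rescheduling")

    # A PDB with zero allowed disruptions rejects the eviction request.
    if pdb_disruptions_allowed == 0:
        reasons.append("PDB currently allows 0 disruptions")
    return reasons


# Illustrative pod, not taken from a real cluster.
example_pod = {
    "metadata": {"name": "cache-0", "annotations": {}},
    "spec": {"volumes": [{"name": "scratch", "emptyDir": {}}]},
}
print(drain_blockers(example_pod, pdb_disruptions_allowed=0))
```

A per-reason count across the cluster gives both teams a shared list to work through, which tends to land better than "just move the pods".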

5 Answers

Answered By WiseOwl42 On

Your DevOps team has valid reasons for their stance. For example, they might have topology constraints and anti-affinity rules to ensure maximum uptime during outages. Keeping certain pods on specific nodes is often essential for stability, even if those nodes could technically be scaled down.

Answered By NodeNerd101 On

One approach is to identify and isolate the workloads causing the issue. If you create a pool of smaller nodes specifically for them, the capacity stranded around each pinned pod shrinks. Once these pods are known and manageable, you can use taints and node affinity to steer them onto that pool.
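As a rough sketch of the taint-plus-affinity idea, the Python below adds a toleration and a required node affinity to a pod spec dict. The label `workload-class=pinned` and the taint `dedicated=pinned:NoSchedule` are hypothetical names chosen for the example; the field names follow the standard pod spec.

```python
import copy
import json

# Hypothetical label and taint carried by the small dedicated nodes.
POOL_LABEL_KEY = "workload-class"
POOL_LABEL_VALUE = "pinned"
POOL_TAINT = {"key": "dedicated", "value": "pinned", "effect": "NoSchedule"}


def pin_to_pool(pod_spec: dict) -> dict:
    """Return a copy of pod_spec that tolerates the pool taint and requires the pool label."""
    spec = copy.deepcopy(pod_spec)

    # The toleration lets the pod onto tainted nodes; the taint keeps other pods off.
    spec.setdefault("tolerations", []).append({
        "key": POOL_TAINT["key"],
        "operator": "Equal",
        "value": POOL_TAINT["value"],
        "effect": POOL_TAINT["effect"],
    })

    # Required node affinity keeps the pod on the pool.
    # Note: this sketch overwrites any nodeAffinity already present.
    spec.setdefault("affinity", {})["nodeAffinity"] = {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [{
                "matchExpressions": [{
                    "key": POOL_LABEL_KEY,
                    "operator": "In",
                    "values": [POOL_LABEL_VALUE],
                }]
            }]
        }
    }
    return spec


print(json.dumps(pin_to_pool({"containers": [{"name": "app", "image": "example/app"}]}), indent=2))
```

The taint and the affinity do different jobs: the taint keeps ordinary workloads off the small nodes, and the affinity keeps the pinned pods from drifting back onto the large ones.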

Answered By CloudGuru42 On

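Tools like Descheduler and Karpenter can help here. Karpenter can consolidate underutilized nodes automatically, and Descheduler can evict pods from lightly used nodes so the scheduler repacks them elsewhere. Both still honor pod disruption budgets, so they help most once overly strict PDBs and affinities have been loosened.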
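Before adopting either tool, a back-of-the-envelope estimate of the possible savings can be useful. The sketch below uses first-fit decreasing bin packing on CPU requests only. This is not how Karpenter or Descheduler decide anything; it ignores memory, affinities, and PDBs, so treat the result as an optimistic lower bound. The function name and the sample numbers are illustrative.

```python
def nodes_needed(pod_cpu_millicores: list, node_cpu_millicores: int) -> int:
    """Estimate the node count with first-fit decreasing packing on CPU requests."""
    free = []  # remaining millicores on each node opened so far
    for request in sorted(pod_cpu_millicores, reverse=True):
        if request > node_cpu_millicores:
            raise ValueError(f"pod requesting {request}m cannot fit on any node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request  # place on the first node with room
                break
        else:
            free.append(node_cpu_millicores - request)  # open a new node
    return len(free)


# Illustrative requests in millicores, packed onto 4-core nodes.
requests = [2000, 1500, 1000, 1000, 500, 500, 250, 250]
print(nodes_needed(requests, node_cpu_millicores=4000))
```

Comparing this number with the current node count shows how much of the gap is down to scheduling constraints as opposed to real demand.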

Answered By TechSavvyHamster On

Weigh the trade-off first. Which costs more: occasional refunds when the system can't scale up in time, or keeping some extra infrastructure running? Yes, you can pack nodes more tightly, but Kubernetes (via an autoscaler) will add nodes again as demand requires. Also consider how efficiently the code in those pods runs; if it requests or uses more resources than it needs, that could be part of the issue.
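That trade-off can be written down as simple arithmetic. The sketch below compares the monthly cost of spare nodes with the expected monthly cost of failing to scale in time. All parameter names and figures are placeholders to be replaced with your own billing and incident data.

```python
def headroom_minus_shortfall(
    spare_nodes: int,
    node_cost_per_month: float,
    shortfall_incidents_per_month: float,
    cost_per_incident: float,
) -> float:
    """Positive result: headroom costs more than the incidents it is meant to prevent."""
    headroom_cost = spare_nodes * node_cost_per_month
    expected_shortfall_cost = shortfall_incidents_per_month * cost_per_incident
    return headroom_cost - expected_shortfall_cost


# Placeholder numbers, not real data.
difference = headroom_minus_shortfall(
    spare_nodes=5,
    node_cost_per_month=300.0,
    shortfall_incidents_per_month=0.5,
    cost_per_incident=2000.0,
)
print(f"Headroom costs {difference:+.2f} per month relative to expected refunds")
```

The hard part is the incident estimate, and that is a number the DevOps team is better placed to supply than finance.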

Answered By EfficientPenguin On

Consider setting up a dedicated pool of nodes that won't be scaled down. If the DevOps team can't evict pods, they should be using a node selector to ensure those pods run in this special pool.
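To check whether the unevictable pods really sit in that pool, a small audit like the one below could run against `kubectl get pods -A -o json` output. It is a sketch under assumptions: `pool=pinned` is a hypothetical node label, and "unevictable" is approximated by the `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` annotation alone. If you run the Cluster Autoscaler, the pool's nodes can carry the `cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"` annotation so they are never scaled down.

```python
PINNED_POOL_SELECTOR = {"pool": "pinned"}  # hypothetical node label for the dedicated pool
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def stray_unevictable_pods(pods: list) -> list:
    """Return names of pods marked unevictable that are not pinned to the dedicated pool."""
    stray = []
    for pod in pods:
        annotations = pod.get("metadata", {}).get("annotations") or {}
        if annotations.get(SAFE_TO_EVICT) != "false":
            continue  # evictable pods may run anywhere
        selector = pod.get("spec", {}).get("nodeSelector") or {}
        if any(selector.get(k) != v for k, v in PINNED_POOL_SELECTOR.items()):
            stray.append(pod["metadata"]["name"])
    return stray


# Illustrative pods, not taken from a real cluster.
pods = [
    {"metadata": {"name": "db-0", "annotations": {SAFE_TO_EVICT: "false"}},
     "spec": {"nodeSelector": {"pool": "pinned"}}},
    {"metadata": {"name": "queue-0", "annotations": {SAFE_TO_EVICT: "false"}},
     "spec": {}},
    {"metadata": {"name": "web-1"}, "spec": {}},
]
print(stray_unevictable_pods(pods))
```

Every name in the output is a pod that can hold a general-purpose node hostage, so the list doubles as a migration backlog.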
