
How can I prevent Karpenter from killing my pod during consolidation?

January 7, 2026

Asked By TechWhiz42 On January 7, 2026

I have a long-running Deployment, Service X, that serves a scheduled event in the evenings. During off-hours the cluster load drops sharply, and Karpenter consolidates aggressively: it removes nodes and packs pods onto fewer instances. The problem is that Service X sometimes gets rescheduled during consolidation, and it takes about 2 to 3 minutes to become ready again. If another service tries to fetch data from Service X during that window, the result is a noticeable outage. I'm considering running Service X on a dedicated node or marking the pod as non-disruptible to prevent eviction, but both feel heavy-handed or could drive up costs. Given the long startup time, intermittent traffic, and Karpenter's aggressive consolidation, is there a more cost-effective way to handle this without locking capacity or disabling consolidation entirely?
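For reference, the "non-disruptible" option mentioned above is usually expressed as a pod annotation. The sketch below builds a JSON merge patch that sets or clears it on a running pod. It assumes the `karpenter.sh/do-not-disrupt` key used by recent Karpenter releases (older releases used a different key, so check the docs for your version), and the helper name is made up for illustration.

```python
import json

# Assumption: annotation key from recent Karpenter releases; verify for your version.
DO_NOT_DISRUPT = "karpenter.sh/do-not-disrupt"


def do_not_disrupt_patch(enabled: bool) -> dict:
    """Build a JSON merge patch for a Pod's metadata (hypothetical helper)."""
    # In a JSON merge patch, a null value removes the key.
    value = "true" if enabled else None
    return {"metadata": {"annotations": {DO_NOT_DISRUPT: value}}}


if __name__ == "__main__":
    # Print the patch that marks the pod as non-disruptible.
    print(json.dumps(do_not_disrupt_patch(True)))
```

The output could be passed to `kubectl patch pod <pod-name> --type merge -p '<json>'`. Patching the pod directly avoids a rollout, but the annotation is lost if the pod is recreated; putting it in the Deployment's pod template makes it permanent at the cost of a restart.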

3 Answers

Answered By CloudGuru89 On January 9, 2026

Have you thought about using a pod disruption budget (PDB)? It lets you control how many pods can be disrupted at once during voluntary events like consolidation. You can find more info in Karpenter's documentation. It could help ensure that at least one instance of your service remains available while pods are being rescheduled.
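A minimal sketch of such a PDB, built as a Python dict and printed as JSON (which `kubectl apply -f -` accepts alongside YAML). The resource name `service-x-pdb` and the label `app: service-x` are assumptions; use whatever labels the Service X pods actually carry.

```python
import json


def pdb_manifest(name: str, app_label: str, min_available: int = 1) -> dict:
    """Build a policy/v1 PodDisruptionBudget manifest (names are illustrative)."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Evictions are refused if they would drop healthy pods below this.
            "minAvailable": min_available,
            # Assumed label; must match the pods of the target Deployment.
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(pdb_manifest("service-x-pdb", "service-x"), indent=2))
```

Note that with a single replica, `minAvailable: 1` leaves no room for any voluntary eviction, so the node hosting the pod is effectively pinned for as long as the pod is healthy.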

TechWhiz42 - January 10, 2026

Yeah, I did consider that option!

Answered By DevOpsDude88 On January 9, 2026

Why not have your service running with two pods that are set to always run on separate nodes? This, combined with a PDB, could prevent any disruptions. It seems like a straightforward fix to me.
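One way to sketch the "two pods on separate nodes" part is a required pod anti-affinity rule keyed on the node hostname. The snippet below builds a Deployment patch fragment as a Python dict; the label `app: service-x` and the helper name are assumptions for illustration, not something from the original setup.

```python
import json


def spread_across_nodes(app_label: str, replicas: int = 2) -> dict:
    """Build a Deployment patch: N replicas, never two on the same node."""
    anti_affinity = {
        # Hard rule: the scheduler refuses to co-locate matching pods.
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {"matchLabels": {"app": app_label}},
                # Standard node label, so the "topology domain" is one node.
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
    return {
        "spec": {
            "replicas": replicas,
            "template": {"spec": {"affinity": {"podAntiAffinity": anti_affinity}}},
        }
    }


if __name__ == "__main__":
    print(json.dumps(spread_across_nodes("service-x"), indent=2))
```

The output could be applied with `kubectl patch deployment <name> --type merge -p '<json>'`. Because the rule is required rather than preferred, the second pod stays Pending if no other node is available.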

TechWhiz42 - January 10, 2026

I had the same thought and pushed the developer for it. But with tight deadlines, they implemented neither the second replica nor the rule keeping the pods on separate nodes. Now they're asking me to find another solution, with a dedicated instance as the last resort until they can sort it out.

Answered By K8sMaster101 On January 7, 2026

If it's feasible, consider scaling up your deployment with multiple instances. You might want to implement a PDB with a minimum of 1 available pod and run two replicas. This could help maintain service availability during consolidation.
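To see why the replica count matters here, a quick back-of-the-envelope check helps. For a PDB that uses `minAvailable`, the number of evictions allowed at any moment is the count of healthy pods minus `minAvailable`, floored at zero. The function name below is made up for illustration.

```python
def disruptions_allowed(healthy_pods: int, min_available: int) -> int:
    """Evictions a minAvailable-style PDB permits right now."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    # Two healthy replicas: one pod may be evicted while the other keeps serving.
    print(disruptions_allowed(healthy_pods=2, min_available=1))
    # Single replica: no eviction is ever allowed, so the node cannot be drained.
    print(disruptions_allowed(healthy_pods=1, min_available=1))
```

So two replicas with `minAvailable: 1` let consolidation move one pod at a time, while a single replica with the same PDB blocks consolidation of that node altogether.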

TechWhiz42 - January 10, 2026

Currently, that's not possible since the service needs to fetch data from an external vendor first and store it in our database. The dev team opted for a simple architecture due to time constraints.
