How can I manage excessive exited containers in my Kubernetes cluster?

Asked By CloudNinja77 On

I'm running a Kubernetes cluster (OpenShift) with Argo Workflows, and the client wants to keep their workflow runs around for a while before cleaning them up. Unfortunately, this has led to thousands of exited containers piling up on a single node. My coworker noticed the kubelet throwing gRPC errors and the node sitting in a 'not ready' state before he manually cleaned up the exited containers.

One error message we encountered was related to exceeding the message size limit: `rpc error: code = ResourceExhausted desc = grpc: received message larger than max (16788968 vs. 16777216)`. Additionally, the Multus CNI configuration file was missing, which seems odd to me.

During a recent test, we ran a cron job over the weekend that spawned 10 containers without cleaning them up. The node went into a 'not ready' state, and we couldn't even SSH into it. The OpenStack logs were flooded with out-of-memory errors: many processes, including Fluent Bit and some .NET applications, were killed because memory usage got out of control. What strategies or configurations can we put in place to address this and manage exited containers more effectively in our workflows?

3 Answers

Answered By OpsMaster3000 On

It sounds like the resource requests on your containers might not be set appropriately. Make sure the requests reflect what the containers actually consume, so the node isn't pushed into out-of-memory situations.
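
Just to illustrate what I mean, here's a rough sketch (with placeholder names and sizes, not your actual workload) of explicit requests, plus limits for good measure, on a container spec built with client-go types:

```go
// Rough illustration only: explicit requests and limits on a container,
// built with client-go types. The name, image, and sizes are placeholders.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	c := corev1.Container{
		Name:  "workflow-step",                // hypothetical container name
		Image: "registry.example.com/step:v1", // hypothetical image
		Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("250m"),
				corev1.ResourceMemory: resource.MustParse("256Mi"),
			},
			Limits: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("500m"),
				corev1.ResourceMemory: resource.MustParse("512Mi"),
			},
		},
	}

	memRequest := c.Resources.Requests[corev1.ResourceMemory]
	memLimit := c.Resources.Limits[corev1.ResourceMemory]
	fmt.Printf("%s: memory request %s, limit %s\n", c.Name, memRequest.String(), memLimit.String())
}
```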

MemorySaver12 -

Interesting point, but it's worth noting that the containers were working fine on other nodes. The main issue here seems to be the excessive number of exited containers taking down the node.

Answered By TechGuru92 On

Isn't the Kubernetes garbage collector supposed to manage old containers automatically? It feels like it should help keep things tidy without manual intervention.

ContainerExpert88 -

As far as I know, the garbage collector only cleans up orphaned containers. Since you can still see these pods with `kubectl get pods`, they aren't considered orphaned, so the GC won't touch them.
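
So if you want to keep the workflow history without the pods dragging a node down, something has to delete the completed pods once they age out. If I remember correctly, Argo Workflows has `ttlStrategy` and `podGC` settings on the Workflow spec for exactly this. Otherwise, here's a minimal sketch of a cleanup job using client-go, assuming in-cluster credentials with pod list/delete permissions, a hypothetical `argo` namespace, and a seven-day retention window:

```go
// Minimal sketch, not a production controller: delete Succeeded pods older
// than a retention window so their exited containers can be garbage
// collected. The "argo" namespace and 7-day window are assumptions.
package main

import (
	"context"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	const namespace = "argo"        // hypothetical namespace for workflow pods
	retention := 7 * 24 * time.Hour // keep completed runs for a week

	pods, err := client.CoreV1().Pods(namespace).List(context.Background(),
		metav1.ListOptions{FieldSelector: "status.phase=Succeeded"})
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range pods.Items {
		if time.Since(p.CreationTimestamp.Time) < retention {
			continue
		}
		if err := client.CoreV1().Pods(namespace).Delete(
			context.Background(), p.Name, metav1.DeleteOptions{}); err != nil {
			log.Printf("failed to delete %s: %v", p.Name, err)
		}
	}
}
```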

Answered By DevOpsDynamo On

You might be running into the default gRPC message size limit of 16 MiB (16777216 bytes). If you have control over the client, consider setting the `grpc.MaxCallSendMsgSize()` call option to allow larger messages. Here's the gRPC documentation on this if you're interested: [gRPC MaxCallSendMsgSize](https://pkg.go.dev/google.golang.org/grpc?utm_source=godoc#MaxCallSendMsgSize)
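
For illustration, here's a minimal sketch of how those options are applied on a Go gRPC client you control. The endpoint and the 32 MiB value are placeholders, not anything from the kubelet's configuration, and since the error in the question is about a *received* message, `MaxCallRecvMsgSize()` is shown alongside the send option:

```go
// Minimal sketch, not the kubelet's own code: raising the gRPC message size
// limits on a client via dial options. The endpoint and 32 MiB value are
// placeholders.
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	const maxMsgSize = 32 * 1024 * 1024 // 32 MiB, double the 16 MiB default

	conn, err := grpc.Dial(
		"unix:///run/containerd/containerd.sock", // hypothetical endpoint
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultCallOptions(
			grpc.MaxCallSendMsgSize(maxMsgSize), // outgoing requests
			grpc.MaxCallRecvMsgSize(maxMsgSize), // incoming responses (the side the error above complains about)
		),
	)
	if err != nil {
		log.Fatalf("dial failed: %v", err)
	}
	defer conn.Close()
	log.Println("connected with raised gRPC message size limits")
}
```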

FixItFelix -

Unfortunately, this limit is hardcoded in the kubelet's CRI client. There's an open issue on GitHub discussing it, but if you're seeing this error, you've likely already got a more serious problem with your setup.
