Why are my pods getting stuck in an error state after scaling down to zero?

Asked By TechyTurtle42 On

I've been running into a problem with my nightly cron job that scales down pods to zero. Instead of terminating properly, they often end up in an 'Error' state. When I later scale the app back up, the new pods start just fine, but I'm left with these old pods stuck in error that I have to delete manually. This issue only seems to happen with one particular app; the others are functioning normally. Can anyone help me figure out what's going on?

4 Answers

Answered By DevGuru12 On

Another thing to try is removing any finalizers on those pods. A stuck finalizer blocks deletion, so clearing it usually lets the API server remove the pod from your list.
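As a sketch of how to do that (the pod name `my-stuck-pod` and namespace `my-namespace` are placeholders for your own values):

```shell
# Inspect the pod's finalizers first
kubectl get pod my-stuck-pod -n my-namespace -o jsonpath='{.metadata.finalizers}'

# Clear all finalizers so the API server can finish deleting the object
kubectl patch pod my-stuck-pod -n my-namespace \
  --type merge -p '{"metadata":{"finalizers":null}}'
```

Only do this after confirming the finalizer's controller is genuinely never going to act, since finalizers exist to guard cleanup logic.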

Answered By KubeNinja77 On

If the pods are not cleaned up automatically, you might need to delete them manually or wait for the garbage collector. Just so you know, by default the terminated-pod-gc-threshold is 12500, so pod garbage collection won't kick in until the cluster accumulates that many terminated pods.
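If you'd rather not wait for the threshold, you can clean them up in bulk by selecting on the pod phase (namespace is a placeholder):

```shell
# Delete every pod in the Failed phase in the given namespace
kubectl delete pods -n my-namespace --field-selector=status.phase=Failed
```

You could add this as a second step in the same nightly cron job that does the scale-down.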

Answered By DataDynamo21 On

When you describe the pod, are you seeing any errors in the state? For example, I noticed that some of my pods showed as 'Terminated' with reason 'Error' and an exit code of 137. That code is 128 + 9 (SIGKILL), which usually means the container was OOMKilled (Out Of Memory). You might want to check the memory usage and limits in your `kubectl describe` output.
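One way to pull just the termination details out, assuming a placeholder pod name and that the container of interest is the first one in the pod:

```shell
# Show why the first container last terminated (exitCode, reason, signal)
kubectl get pod my-stuck-pod -n my-namespace \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
```

If the reason comes back as `OOMKilled`, compare the pod's memory limit against its actual usage and raise the limit accordingly.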

Answered By CloudChaser99 On

Have you checked the logs for those pods using `kubectl logs`? That might give you some insight into what's causing the error state.
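Since the failing container has already exited, you may need the `--previous` flag to see its output (pod name and namespace are placeholders):

```shell
# Fetch logs from the previous (terminated) instance of the container
kubectl logs my-stuck-pod -n my-namespace --previous
```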

