Hey everyone,
I've got some worker pods that occasionally run for a long time, and I've set `terminationGracePeriodSeconds` for them. I'm wondering if there's a straightforward way to monitor when a pod is actually killed because it exceeded that grace period. I'm looking to set up a monitor in Datadog for this. From what I understand, kubelet may not emit a specific event for this scenario.
Thanks a ton for your help!
2 Answers
It sounds like some of your pods are being killed before they finish their work, which may mean your termination grace period is set too low. It's definitely worth monitoring this so you know your workers are getting enough time to shut down cleanly. A dashboarding tool like Grafana can help here, though the same signal works in Datadog since that's what you're already running.
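One reliable signal, whichever tool you use: when the grace period expires, kubelet sends SIGKILL, and the container's last termination state records exit code 137. Here's a minimal sketch using the official Python `kubernetes` client that flags such containers; the function name is mine, and in practice you'd run this as a periodic check rather than a one-off script:

```python
# pip install kubernetes
from kubernetes import client, config

def find_force_killed_containers():
    """Flag containers whose last termination was a SIGKILL (exit code 137),
    the signature of a pod killed after exceeding its grace period."""
    config.load_kube_config()  # use config.load_incluster_config() when running in-cluster
    v1 = client.CoreV1Api()
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        for status in pod.status.container_statuses or []:
            term = status.last_state.terminated if status.last_state else None
            if term and term.exit_code == 137:
                print(f"{pod.metadata.namespace}/{pod.metadata.name} "
                      f"container={status.name} reason={term.reason}")

if __name__ == "__main__":
    find_force_killed_containers()
```

One caveat: OOM kills also exit with 137 but carry the reason `OOMKilled`, so filter on the `reason` field if you want to separate grace-period kills from memory kills.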
Exactly! The termination grace period only matters during shutdown: kubelet sends SIGTERM, waits for the grace period, and if the process is still running when it expires, sends SIGKILL. That forced kill is the event worth tracking to avoid data loss. I'm also interested in a solution for this!
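To get this into Datadog specifically, one approach is to emit a custom metric whenever a check like the one above finds a force-killed container, then build a monitor on that metric. A sketch assuming a Datadog Agent with DogStatsD listening on its default local port; the metric name and tags here are made up for illustration:

```python
# pip install datadog
from datadog import initialize, statsd

# Point at the local Datadog Agent's DogStatsD endpoint (default port 8125).
initialize(statsd_host="127.0.0.1", statsd_port=8125)

# Hypothetical metric name; increment it once per detected force-kill and
# alert on the resulting count in a Datadog monitor.
statsd.increment(
    "worker.pod.force_killed",
    tags=["kube_namespace:jobs", "reason:grace_period_exceeded"],
)
```

Datadog's Kubernetes State Metrics check also reports container termination states out of the box, so it's worth checking the current metric names in the Datadog docs before rolling your own.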
