I'm curious why it's important to drain a node before upgrading it in a Kubernetes cluster. What are the consequences of not doing this? Also, if a node suddenly goes down, how does Kubernetes handle the pods that were running on it?
5 Answers
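If the node goes down and doesn't come back, its pods can end up stuck in an "Unknown" or "Terminating" state. This happens because the control plane can't reach the kubelet, so it can't confirm what's going on with those pods. By default they are marked for eviction after about five minutes (the default toleration for the not-ready and unreachable taints), and Deployment pods get replacements on other nodes, but the old pod objects hang around until the node returns or you force-delete them. Draining the node beforehand avoids that: Kubernetes evicts the pods gracefully and their controllers recreate them on other nodes, which minimizes downtime. Cordoning keeps new pods from being scheduled on the node; kubectl drain does that for you as its first step, though cordoning explicitly beforehand doesn't hurt.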
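For reference, the cordon / drain / uncordon sequence around an upgrade can be sketched in Python like this. It's only a sketch: the function names, the node name "worker-1", and the 120-second timeout are made up for illustration, and the drain flags shown are the ones most clusters need, not necessarily the ones yours does.

```python
import shlex
import subprocess


def drain_commands(node, timeout_seconds=120):
    """kubectl commands to run BEFORE upgrading a node (hypothetical helper)."""
    return [
        # Mark the node unschedulable so no new pods land on it.
        ["kubectl", "cordon", node],
        # Evict the pods. DaemonSet pods are skipped and emptyDir data is
        # discarded, which is usually what you want but is an assumption here.
        [
            "kubectl", "drain", node,
            "--ignore-daemonsets",
            "--delete-emptydir-data",
            f"--timeout={timeout_seconds}s",
        ],
    ]


def uncordon_command(node):
    """kubectl command to run AFTER the upgrade so the node takes pods again."""
    return ["kubectl", "uncordon", node]


def run(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(drain_commands("worker-1"))
    run([uncordon_command("worker-1")])
```

With dry_run left at True this only prints the commands, so you can check them before letting it touch a real cluster.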
When a node goes down unexpectedly, Kubernetes can't tell whether its pods are still running. They stay flagged as unknown until the node is back online or the pods are removed. For planned work, drain the node first; if the node is already gone for good, delete it from Kubernetes. Both lead to the pods being recreated elsewhere, but draining evicts them gracefully and respects PodDisruptionBudgets.
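To make the disruption budget part concrete, here is a rough Python sketch of the arithmetic a drain runs into. It is a simplification, not the controller's actual code: it only handles integer values (no percentages), and the function and parameter names are my own.

```python
def disruptions_allowed(healthy, expected, min_available=None, max_unavailable=None):
    """How many pods could be evicted right now without breaking the budget.

    healthy:  pods of the workload that are currently ready
    expected: replica count the workload is supposed to have
    Exactly one of min_available / max_unavailable must be given,
    mirroring the two ways a PodDisruptionBudget can be written.
    """
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = min_available
    else:
        desired_healthy = expected - max_unavailable
    # Never negative: an already-violated budget simply allows zero evictions.
    return max(0, healthy - desired_healthy)
```

So 3 healthy replicas with minAvailable 2 leave room for one eviction, while a single replica with minAvailable 1 leaves room for none, and a drain will sit there waiting on that pod until it times out.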
Essentially, if you skip the drain and a workload's replicas aren't spread across nodes, you're setting yourself up for an outage when that node goes away. It's about making sure your workloads are resilient and can tolerate losing a node.
If a node abruptly goes down, you may need to force-delete the affected pods yourself. This is especially true for StatefulSet pods, which Kubernetes won't replace until it knows the old ones are gone. Draining ahead of time is safer. If you've also set enough replicas and proper anti-affinity rules (or topology spread constraints), not all pods from the same deployment sit on one node, so losing one node doesn't cause an outage.
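A quick way to check that last point before touching a node is to list which workloads have every pod on it. A minimal Python sketch, assuming you've already collected (workload, pod, node) tuples from somewhere like kubectl get pods -o wide; the function name and sample data are made up:

```python
from collections import defaultdict


def workloads_at_risk(pod_placements, node):
    """Return workloads whose pods ALL run on the given node.

    pod_placements: iterable of (workload, pod_name, node_name) tuples.
    A workload in the result would have zero running pods if the node
    disappeared, no matter how many replicas it has.
    """
    nodes_by_workload = defaultdict(set)
    for workload, _pod_name, pod_node in pod_placements:
        nodes_by_workload[workload].add(pod_node)
    return sorted(w for w, nodes in nodes_by_workload.items() if nodes == {node})


if __name__ == "__main__":
    placements = [
        ("web", "web-1", "node-a"),
        ("web", "web-2", "node-b"),
        ("db", "db-0", "node-a"),
        ("cache", "cache-1", "node-a"),
        ("cache", "cache-2", "node-a"),
    ]
    print(workloads_at_risk(placements, "node-a"))
```

In the sample data, "cache" has two replicas but both are on node-a, so more replicas alone don't help without a rule that spreads them out.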
It really depends on what's running on that node. For instance, I've noticed that rook-ceph can sometimes crash if the kubelet is restarted unexpectedly. You want to make sure your setup can handle downtime gracefully.
Do you have any links to GitHub issues related to this? I’ve been testing rook ceph a lot but haven’t encountered those issues.
Are you talking about general setups? Like a StatefulSet hosting PostgreSQL?