Hey everyone! I'm currently trying to find the best method to automate OS patching since I'm using Ubuntu for my K8s nodes. Right now, I'm relying on Ubuntu's unattended upgrades along with Kured, but I'm not a fan of this setup. The issue is that when the apt-get upgrade finishes, the rke2-server service restarts without draining the node first, which renders Kured pretty ineffective. I'm looking for suggestions on a more effective approach. Ideally, I'd like to first drain the node, run the necessary apt commands, reboot if needed, and then uncordon the node. Any tips?
6 Answers
You should definitely check out rancher/system-upgrade-controller on GitHub. As for the rke2-server service restart, you can override its restart behavior in the needrestart configuration. For example, in a snippet under /etc/needrestart/conf.d/ (e.g. rke2.conf; as far as I know needrestart only reads files there that end in .conf), set `$nrconf{override_rc}{qr(^rke2.+\.service)} = 0;` to prevent it from restarting when updates are applied.
I usually prefer to replace nodes rather than just rebooting them. I netboot them with a new image that already has the updates applied. Kured is a bit random with its restarts based on uptime, so things will reboot soon after new updates become available. You could tweak needrestart to skip restarting the rke2 service automatically, then set a marker for Kured to manage those restarts instead.
Love that approach of instance expiry and automatic rotation! Makes things simpler.
Ubuntu creates the file /var/run/reboot-required when a reboot is needed. You can configure unattended-upgrades not to reboot automatically (`Unattended-Upgrade::Automatic-Reboot "false";`) and let Kured treat that file as its signal to drain the node, reboot it, and uncordon it once it is ready.
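If you want to see what that signal looks like from a script, here is a minimal Python sketch. The function names and the `run_dir` parameter are my own, purely for illustration; the only facts it relies on are that Ubuntu writes /var/run/reboot-required and, alongside it, /var/run/reboot-required.pkgs listing the packages that asked for the reboot.

```python
from pathlib import Path
from typing import List


def reboot_required(run_dir: str = "/var/run") -> bool:
    """Return True if the Ubuntu reboot sentinel file exists."""
    return (Path(run_dir) / "reboot-required").exists()


def packages_requesting_reboot(run_dir: str = "/var/run") -> List[str]:
    """Return the de-duplicated, sorted package names from reboot-required.pkgs."""
    pkgs_file = Path(run_dir) / "reboot-required.pkgs"
    if not pkgs_file.exists():
        return []
    lines = pkgs_file.read_text().splitlines()
    # The file can list the same package more than once, so use a set.
    return sorted({line.strip() for line in lines if line.strip()})
```

Calling `reboot_required()` on a node tells you whether Kured would consider it due for a reboot, and `packages_requesting_reboot()` tells you why.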
I might not fully grasp the issue you’re facing, but we've been using Karpenter with EKS. It pulls the latest AMIs, and with a node TTL of 7 days, we rely on AWS to keep those AMIs updated.
Honestly, I think the easiest solution is to use Cluster API with a provider that suits your bare metal or VM platform. It simplifies the management, so you don't have to reinvent the wheel when there are existing solutions available.
If you want a more script-based approach, consider using Ansible along with Jenkins, GitHub Actions, or Rundeck to manage the whole process.
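For a rough idea of what such a scripted run would do, here is a Python sketch of the sequence from the original question: drain, apt, reboot if needed, uncordon. It is a sketch under assumptions, not a finished tool. It assumes it runs from an admin machine that has kubectl access to the cluster and passwordless ssh/sudo to the node, and that the Kubernetes node name is also its ssh hostname. `patch_node`, `run_command`, and the timeout values are hypothetical names and numbers I picked.

```python
import subprocess
import time
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]

REBOOT_SENTINEL = "/var/run/reboot-required"


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def patch_node(
    node: str,
    run: Runner = run_command,
    sleep: Callable[[float], None] = time.sleep,
    boot_grace_seconds: float = 120,
) -> List[str]:
    """Drain, patch, optionally reboot, and uncordon one node.

    Returns the commands that were issued, in order. If a step fails the
    function raises and the node stays cordoned for inspection.
    """
    log: List[str] = []

    def step(cmd: Sequence[str], must_succeed: bool = True) -> int:
        log.append(" ".join(cmd))
        rc = run(cmd)
        if must_succeed and rc != 0:
            raise RuntimeError(f"step failed (exit {rc}): {' '.join(cmd)}")
        return rc

    step(["kubectl", "drain", node, "--ignore-daemonsets",
          "--delete-emptydir-data", "--timeout=600s"])
    step(["ssh", node, "sudo", "apt-get", "update"])
    step(["ssh", node, "sudo", "env", "DEBIAN_FRONTEND=noninteractive",
          "apt-get", "-y", "upgrade"])

    # `test -f` exits 0 if the sentinel exists and 1 if it does not;
    # anything else (e.g. 255 from ssh) means the check itself failed.
    rc = step(["ssh", node, "test", "-f", REBOOT_SENTINEL], must_succeed=False)
    if rc not in (0, 1):
        raise RuntimeError(f"could not check reboot sentinel (exit {rc})")

    if rc == 0:
        # The ssh session usually drops during reboot, so ignore the exit code.
        step(["ssh", node, "sudo", "systemctl", "reboot"], must_succeed=False)
        # Crude: give the node time to actually go down before waiting for Ready.
        sleep(boot_grace_seconds)
        step(["kubectl", "wait", "--for=condition=Ready", f"node/{node}",
              "--timeout=900s"])

    step(["kubectl", "uncordon", node])
    return log
```

Something like `patch_node("worker-1")` would then handle one node; looping over nodes one at a time gives you a rolling patch. The fixed sleep before `kubectl wait` is the weakest part, since the node can still report Ready for a short while after the reboot starts, so a real setup would want a more reliable check there.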

We’ve been using the system-upgrade-controller for a few years now, and it’s been working great for us!