I have an RKE cluster set up with 6 nodes: 3 master nodes and 3 worker nodes. Recently I removed the Docker containers from one of the worker nodes (the one at 10.10.10.14), and I'm trying to find a way to restore them. Can anyone guide me on how to do that? I've checked the node status with `kubectl get nodes -o wide`, and node 10.10.10.14 is currently marked as `NotReady`. I also ran `docker ps -a` on both the troubled worker and a healthy worker (10.10.10.15) to compare what's running. Any insight on recovering the containers would be greatly appreciated!
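For reference, here's roughly what I ran while checking (the addresses are from my setup):

```bash
# From a master node: 10.10.10.14 shows up as NotReady
kubectl get nodes -o wide

# On the broken worker (10.10.10.14) and a healthy one (10.10.10.15),
# to compare which containers exist
docker ps -a
```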
2 Answers
From what I understand, there's no rke2-agent running on that node, but the recent history indicates the containers were removed. Make sure you still have the correct manifest file, whether that's the `cluster.yml` you provisioned the cluster with or some other configuration; restoring the proper settings for your cluster is the crucial part.
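If this is classic RKE (the Docker-based distribution rather than RKE2), the usual recovery path is to rerun `rke up` from the machine the cluster was provisioned from. This is only a sketch and assumes you still have the original `cluster.yml` and the matching `cluster.rkestate` that sits beside it:

```bash
# Run from the workstation where the cluster was originally provisioned.
# rke reconciles the desired state and recreates any missing system
# containers (kubelet, kube-proxy, etc.) on the listed nodes.
# Without the matching cluster.rkestate, certificates get regenerated,
# which can disrupt the rest of the cluster.
rke up --config cluster.yml
```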
Have you considered simply restarting the rke2-agent? Another option is to move the manifests out of the manifests folder and back in to trigger a redeployment.
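A rough sketch of both options, assuming an RKE2-style install with default paths (the manifests directory only exists on server nodes):

```bash
# Option 1: restart the agent service on the affected worker
systemctl restart rke2-agent.service
systemctl status rke2-agent.service --no-pager

# Option 2 (run on a server node): move the bundled manifests out and back
# so the deploy controller re-applies them
mkdir -p /root/manifests-backup
mv /var/lib/rancher/rke2/server/manifests/*.yaml /root/manifests-backup/
sleep 30
mv /root/manifests-backup/*.yaml /var/lib/rancher/rke2/server/manifests/
```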
I actually found an old config.yml, but it's a bit outdated. If I only want to recover that specific node, can I safely delete the other nodes from the config? I'm worried about removing anything critical, since Docker is still running on that node.
Just to clarify, are you referring to the `cluster.yml` file? If `find / -name 'cluster.yml'` doesn't turn it up, it may be worth checking whether it lives somewhere else under a different name, or whether it needs to be restored from a backup.
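A quick sketch of that search, assuming the cluster was provisioned with the `rke` CLI (the file normally lives wherever `rke up` was originally run, next to `cluster.rkestate` and `kube_config_cluster.yml`):

```bash
# Search the whole filesystem, suppressing permission-denied noise
find / -name 'cluster.yml' 2>/dev/null

# The state and kubeconfig files that rke writes alongside it are good markers too
find / \( -name 'cluster.rkestate' -o -name 'kube_config_cluster.yml' \) 2>/dev/null
```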