Help with Calico Issues on CentOS 9 Kubernetes Setup

Asked By TechieGuy321 On

I'm setting up a Kubernetes cluster using Kubespray on a 3-node CentOS 9 environment. The structure is: one control plane node that also runs etcd and is a worker node, plus two additional nodes for workloads. I've opted to use Calico as the CNI provider, and while all nodes are registering correctly, I'm encountering problems with the calico-kube-controllers pod failing to create due to a network setup timeout. The error message I receive is: "Failed to create pod sandbox: ... plugin type="calico" failed (add): error getting ClusterInformation: ... Operation timed out".

I created firewall zones for internal app communication on each node and opened the necessary ports, but I suspect there might be an issue with my network or firewall configuration since this setup worked on Ubuntu 24 without issues. I've also disabled firewalld completely to troubleshoot, but the calico pods still remain in a 'ContainerCreating' state. Anyone experienced similar issues or have suggestions to get around this? I can provide logs and details if needed!
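For context, the per-node zone setup looked roughly like this (the zone name, subnet, and exact port list below are just representative, not my full config):

```bash
# Custom zone for internal cluster traffic (names/ranges are placeholders)
sudo firewall-cmd --permanent --new-zone=k8s-internal
sudo firewall-cmd --permanent --zone=k8s-internal --add-source=192.168.1.0/24
sudo firewall-cmd --permanent --zone=k8s-internal --add-port=6443/tcp    # kube-apiserver
sudo firewall-cmd --permanent --zone=k8s-internal --add-port=10250/tcp   # kubelet
sudo firewall-cmd --permanent --zone=k8s-internal --add-port=179/tcp     # Calico BGP
sudo firewall-cmd --reload
```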

3 Answers

Answered By NetworkNinja75 On

Sounds like you're facing a typical issue with Calico where it can't connect to the Kubernetes API server to retrieve cluster information. Here are a few things you could check:

1. Verify that SELinux isn't blocking the CNI plugins; you can temporarily set it to permissive mode using `setenforce 0` to test this.
2. Double-check your firewall rules. The custom firewalld zones you've set up might not be allowing the required traffic between your nodes and the API server.
3. Make sure your nodes can reach 10.233.0.1 on port 443 (the in-cluster API service address); `ping` won't tell you about a TCP port, so test the connection with `curl` or `nc`.
4. Take a look at your DNS setup; connectivity issues like this often stem from DNS failures, which leave pods stuck in the 'ContainerCreating' state.

Try these steps (a rough sketch of the commands is below) and let us know what you find!
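Something like this, command-wise (10.233.0.1 and 10.233.0.3 are the Kubespray defaults for the API service and cluster DNS; adjust if your cluster differs):

```bash
# 1. SELinux: switch to permissive temporarily (revert with `sudo setenforce 1`)
sudo setenforce 0
getenforce

# 2. Firewall: list the active zones and what each one allows
sudo firewall-cmd --get-active-zones
sudo firewall-cmd --list-all --zone=<your-custom-zone>

# 3. API service VIP: test TCP connectivity on port 443 from every node
nc -zv 10.233.0.1 443
curl -k https://10.233.0.1/version

# 4. DNS: query the cluster DNS service directly from the node
nslookup kubernetes.default.svc.cluster.local 10.233.0.3
```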

DevMaster99 -

Great tips! Just to add, disabling firewalld completely sometimes helps isolate issues like this; after that, re-enable it gradually, re-adding your rules one group at a time to see which one blocks the connection.
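To make that concrete, roughly like this (the port list is the usual Kubernetes/Calico set, not exhaustive; `ipencap` is the /etc/protocols name for IP protocol 4 used by IP-in-IP, so verify it on your system, and target your custom zone with `--zone=...` if you keep it):

```bash
# Start firewalld again, then re-add rules one group at a time and retest the pods
sudo systemctl start firewalld

# Kubernetes control plane and kubelet
sudo firewall-cmd --permanent --add-port=6443/tcp        # kube-apiserver
sudo firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd
sudo firewall-cmd --permanent --add-port=10250/tcp       # kubelet

# Calico
sudo firewall-cmd --permanent --add-port=179/tcp         # BGP
sudo firewall-cmd --permanent --add-port=4789/udp        # VXLAN, if used
sudo firewall-cmd --permanent --add-protocol=ipencap     # IP-in-IP (IP protocol 4), if used

sudo firewall-cmd --reload
```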

Answered By KubeWizard42 On

Hey, have you checked your k8s logs for any specific pod failures? It sounds like the coredns pod is having issues similar to calico, and these connectivity problems are often linked. Inspect those logs and confirm whether the API server is reachable from that pod too. If your curl command to the API server returns a 403 Forbidden, permissions are likely the problem; check the RBAC configuration as well!
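A few commands along those lines, if it helps (the service-account name and Calico resource below are my assumptions about a stock Kubespray/Calico install; swap in your own pod names):

```bash
# Events for the stuck pod and recent coredns logs
kubectl -n kube-system get pods -o wide
kubectl -n kube-system describe pod <calico-kube-controllers-pod-name>
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=50

# From a node: any HTTP response (even 401/403) means the network path is fine;
# a timeout means the problem is still connectivity, not RBAC
curl -k https://10.233.0.1:443/healthz

# RBAC check for the controller's service account (name assumed)
kubectl auth can-i get clusterinformations.crd.projectcalico.org \
  --as=system:serviceaccount:kube-system:calico-kube-controllers
```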

TechieGuy321 -

Thanks! I’ll look through the logs and permissions settings next. The curl response was definitely a heads-up that something in the access controls might need adjustment.

Answered By K8sGuru87 On

Hi there! I'm experiencing something similar. One thing that worked for me was making sure the /etc/hosts file on each node correctly maps the API server's hostname to its IP address. Sometimes name resolution causes failures that look like firewall or CNI problems. Also, have you tried restarting just the kubelet service on each node instead of rebooting the whole node? That can clear up lingering state.
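Roughly what I did, for reference (the hostname, IP, and the containerd restart are examples assuming a Kubespray-style setup; use your own values):

```bash
# Confirm the control-plane hostname resolves the same way on every node
getent hosts node1            # or whatever name the kubelet/kubeconfig uses
echo "192.168.1.10  node1" | sudo tee -a /etc/hosts   # only if resolution is wrong

# Restart just the node agents instead of rebooting the node
sudo systemctl restart containerd kubelet
journalctl -u kubelet -f      # watch whether the sandbox/CNI errors clear up
```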

TechieGuy321 -

I hadn't thought of that! I'll give it a shot and see if it makes any difference. Thanks for the suggestion!
