Hey everyone,
I'm facing some challenges with my EKS cluster setup, particularly with the custom ENIConfig. I've set up my VPC with several subnets, but I'm not sure if I'm missing something in the configuration. Here are the subnets I've got:
- **ca-central-1b**: 10.57.230.224/27, subnet-0c4a88a8f1b26bc60
- **ca-central-1a**: 10.57.230.128/27, subnet-0976d07c3c116c470
- **ca-central-1a**: 100.64.0.0/16, subnet-09957660df6e30540
- **ca-central-1a**: 10.57.230.192/27, subnet-0b74d2ecceca8e440
- **ca-central-1b**: 10.57.230.160/27, subnet-021e1b90f8323b00
All CIDRs are already associated with the VPC, and since I don't control the networking side, I have to build the EKS cluster with these subnets as they are. During cluster creation I selected the private subnets (10.57.230.128/27 and 10.57.230.160/27), attached the recommended IAM policies, and kept the default add-ons.
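For reference, the CLI equivalent of what I did in the console would be roughly this (cluster name and role ARN are placeholders, not my actual values):

```bash
# Rough CLI equivalent of the console setup; name and role ARN are placeholders.
aws eks create-cluster \
  --name my-cluster \
  --region ca-central-1 \
  --role-arn arn:aws:iam::111122223333:role/eksClusterRole \
  --resources-vpc-config subnetIds=subnet-0976d07c3c116c470,subnet-021e1b90f8323b00
```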
Once the EKS cluster was provisioned, I followed the AWS documentation to enable custom networking with an ENIConfig. Since I only have one subnet from the 100.64.0.0/16 block, and it sits in ca-central-1a, I expect my worker nodes to deploy only in that zone so they can use the secondary subnet for pod networking.
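Concretely, the custom networking steps I followed look roughly like this (the security group ID below is a placeholder; in my actual ENIConfig I used the cluster security group):

```bash
# Turn on custom networking in the VPC CNI (aws-node daemonset).
kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true

# ENIConfig named after the AZ, pointing at the 100.64.0.0/16 subnet.
# sg-xxxxxxxxxxxxxxxxx is a placeholder for the security group I used.
cat <<EOF | kubectl apply -f -
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: ca-central-1a
spec:
  subnet: subnet-09957660df6e30540
  securityGroups:
    - sg-xxxxxxxxxxxxxxxxx
EOF

# Tell the CNI to pick the ENIConfig matching each node's zone label.
kubectl set env daemonset aws-node -n kube-system ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
```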
After enabling custom networking and applying the ENIConfig, I tried creating a node group using just the ca-central-1a subnet, with the relevant IAM policies attached. However, the node group gets stuck in the creating state and eventually fails with the error that the nodes cannot join the cluster. I can't extract more info from the web console, but I'm starting to wonder whether I need to split my 100.64.0.0/16 block into two subnets, one per AZ.
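If it helps, I can try to pull more detail from the CLI than the console shows me; I was planning on something like this (cluster and node group names are placeholders for mine):

```bash
# Health issues reported for the failed node group.
aws eks describe-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name custom-networking-ng \
  --query 'nodegroup.health.issues'

# Whether any of the instances registered with the API server at all.
kubectl get nodes -o wide
```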
Has anyone here run into similar issues? I'm open to any suggestions or fixes! Thanks!
3 Answers
Did you enable custom networking before or after creating the nodes? If it was after, you need to replace (cycle) the existing nodes, as mentioned in the docs — the CNI only applies the ENIConfig to nodes launched after custom networking is enabled. Also, check the kubelet logs on the nodes for more detailed error messages; they usually show why a node can't register with the cluster.
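If you can get onto one of the instances (SSH or SSM Session Manager), something along these lines usually shows why a node never registered:

```bash
# On the worker node itself: kubelet and VPC CNI (ipamd) logs.
journalctl -u kubelet --no-pager | tail -n 100
sudo tail -n 100 /var/log/aws-routed-eni/ipamd.log
```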
Keep in mind that a /27 subnet only gives you a limited number of addresses: 32 in total, of which AWS reserves 5 per subnet, leaving 27 usable IPs to share between node ENIs and pods. Just make sure your node and pod counts stay within that, or it could cause issues.
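A quick way to sanity-check how much room is actually left in each subnet (using the subnet IDs from your question):

```bash
# Free IPs remaining in the two node subnets and the 100.64.0.0/16 pod subnet.
aws ec2 describe-subnets \
  --subnet-ids subnet-0976d07c3c116c470 subnet-021e1b90f8323b00 subnet-09957660df6e30540 \
  --query 'Subnets[].{AZ:AvailabilityZone,CIDR:CidrBlock,FreeIPs:AvailableIpAddressCount}' \
  --output table
```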
No worries there! I plan to use 100.64.0.0/16 for the pod networking with the custom ENIConfig.
First off, double-check your security groups. If the nodes can't reach the cluster endpoint, that alone will stop them from joining. Make sure the cluster security group allows communication with the security groups attached to your nodes, and check the securityGroups you referenced in the ENIConfig as well.
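For example, you can pull the cluster security group and inspect its rules like this (substitute your cluster name, and the group ID returned by the first command):

```bash
# Security group EKS created for cluster <-> node communication.
aws eks describe-cluster \
  --name my-cluster \
  --query 'cluster.resourcesVpcConfig.clusterSecurityGroupId'

# Inbound/outbound rules on that group (replace the placeholder ID).
aws ec2 describe-security-group-rules \
  --filters Name=group-id,Values=sg-xxxxxxxxxxxxxxxxx
```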
I think the security groups are fine, since the original node group joined the cluster without issues. After enabling the custom ENIConfig, though, the existing nodes went 'NotReady', and the new node group failed to create.
I created the worker node group after setting up the custom ENIConfig. But I'll look into the kubelet logs as you suggested!