Struggling with k3s, Cilium, and BGP for VIP Setup

Asked By CloudyNinja42 On

Hey everyone! I've been working on a project involving k3s, Cilium, and BGP for managing a virtual IP (VIP), and honestly, I'm feeling really lost. I've spent over five days trying to get this configured correctly, but I think I've run into issues with asymmetric routing and hairpinning in my BGP setup. Here's a quick overview of my configuration:

- My network range is 10.10.1.0/24, with my router sitting at 10.10.1.1.
- The k3s nodes are named infra1 through infra8, with IPs from 10.10.1.11 to 10.10.1.18.
- The VIP for my service is 10.10.10.6, and I'm currently debugging with it pinned to infra1 (10.10.1.11).
- The service is set to externalTrafficPolicy: Local.

My setup includes various configurations for Cilium and BGP, and I've run tests to see where the routing breaks down. So far, the VIP is reachable from the k3s nodes themselves, but access from outside the cluster (for example, from my laptop) fails. It seems I can reach the services via DNS, which suggests something might be off with how TCP traffic is being handled.

I've tried several troubleshooting steps, like adding static routes and modifying iptables rules, but nothing works consistently. The routing as a whole is confusing to me, especially the path the return traffic takes. Any advice or insights would be hugely appreciated!

1 Answer

Answered By TechieTurtle99 On

First off, have you checked the forwarding settings on your k8s nodes? It’s also crucial to check what the routing table looks like on your router. Not being able to ping or traceroute the VIP from outside is expected, because service VIPs advertised over BGP typically don’t answer ICMP the way an ordinary host address does. If your laptop is on the same LAN as your nodes, the routing is likely asymmetric: requests go through the router, but replies come back directly. You might need to set up a hairpin NAT rule to fix that.
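For the "forwarding settings" check, here is a minimal Python sketch that reads the relevant Linux sysctls on a node. It assumes a Linux host with the standard /proc/sys layout. The helper names `read_sysctls` and `findings` are made up for illustration, and the choice of keys (IPv4 forwarding and reverse-path filtering) is my guess at what matters most for asymmetric paths, not a complete checklist.

```python
from pathlib import Path

# Sysctls that commonly matter for forwarding and asymmetric paths (Linux).
SYSCTL_KEYS = (
    "net/ipv4/ip_forward",
    "net/ipv4/conf/all/rp_filter",
    "net/ipv4/conf/default/rp_filter",
)


def read_sysctls(root="/proc/sys"):
    """Return {key: value} for each key; value is None if it cannot be read."""
    settings = {}
    for key in SYSCTL_KEYS:
        try:
            settings[key] = (Path(root) / key).read_text().strip()
        except OSError:
            settings[key] = None
    return settings


def findings(settings):
    """Turn raw sysctl values into short notes worth a closer look."""
    notes = []
    if settings.get("net/ipv4/ip_forward") == "0":
        notes.append("ip_forward=0: the node will not forward IPv4 packets")
    for key, value in settings.items():
        # rp_filter: 0 = off, 1 = strict, 2 = loose.
        if key.endswith("rp_filter") and value == "1":
            notes.append(
                f"{key}=1: strict reverse-path filtering may drop asymmetric traffic"
            )
    return notes
```

Running `findings(read_sysctls())` on each node gives a quick list of settings to investigate. An empty list only means these particular values look fine, not that the node is configured correctly.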

CloudyNinja42 -

Thanks! I think you're on to something here. My laptop is at 10.10.1.100 and the nodes are at 10.10.1.11-18, so they share the same /24. Sounds like the return traffic goes straight to my laptop over the LAN (resolved via ARP) instead of back through the router, right?
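To make the asymmetry concrete, here is a small Python sketch using the addresses from this thread. It assumes every host has only its connected 10.10.1.0/24 route plus a default route via 10.10.1.1, and that the node answers according to its ordinary routing table with no policy routing or SNAT involved. `next_hop` and `is_asymmetric` are hypothetical helpers, not part of any tool.

```python
import ipaddress

LAN = ipaddress.ip_network("10.10.1.0/24")
ROUTER = ipaddress.ip_address("10.10.1.1")


def next_hop(dst, lan=LAN, gateway=ROUTER):
    """Next hop used by a LAN host: on-link destinations are reached
    directly (via ARP), everything else is sent to the default gateway."""
    dst = ipaddress.ip_address(dst)
    return dst if dst in lan else gateway


def is_asymmetric(client, vip):
    """True if the request goes via the router but the reply does not."""
    forward = next_hop(vip)     # client -> VIP
    reverse = next_hop(client)  # node -> client (the reply)
    return forward == ROUTER and reverse == ipaddress.ip_address(client)


# Laptop on the same LAN as the nodes, VIP outside the /24.
print(is_asymmetric("10.10.1.100", "10.10.10.6"))
```

Under those assumptions the request to 10.10.10.6 leaves the laptop via the router, while the node's reply to 10.10.1.100 is delivered on-link. A stateful device on the forward path then sees only half of the TCP conversation, which is the situation a hairpin NAT rule is meant to address.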
