Hey everyone! I'm trying to set up my K8s cluster so that intra-cluster traffic stays within the same zone whenever possible. Some background on my setup: I'm running vanilla Kubernetes on-prem, with MetalLB and Cilium as the CNI plugin. I have three worker nodes split across two zones: node-1 and node-2 in zone-1, and node-3 in zone-2.
Currently, I have two services: Service-A (the frontend) and Service-B (a backend HTTP server). Service-B has Pods running on all nodes for testing purposes (not guaranteed for production). What I want is for a Service-A Pod on node-1 to make requests to Service-B using its ClusterIP. Ideally, it should first check for a Service-B Pod on the same node, then look for one in the same zone, and finally, if that fails, look on any node across zones.
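To pin down the fallback order being asked for, here is a minimal Python sketch of the selection logic. It only models the desired behavior (same node, then same zone, then anywhere); it is not how kube-proxy or Cilium pick endpoints. The `Endpoint` class, the `candidates` function, and the IP addresses are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    ip: str
    node: str
    zone: str


def candidates(endpoints, client_node, client_zone):
    """Return the endpoints a client should try, narrowest scope first."""
    # 1. Prefer endpoints on the client's own node.
    same_node = [e for e in endpoints if e.node == client_node]
    if same_node:
        return same_node
    # 2. Otherwise prefer endpoints in the client's zone.
    same_zone = [e for e in endpoints if e.zone == client_zone]
    if same_zone:
        return same_zone
    # 3. Otherwise fall back to every endpoint in the cluster.
    return list(endpoints)


# Made-up addresses for the three-node layout described above.
service_b = [
    Endpoint("10.0.1.10", "node-1", "zone-1"),
    Endpoint("10.0.2.10", "node-2", "zone-1"),
    Endpoint("10.0.3.10", "node-3", "zone-2"),
]

print([e.ip for e in candidates(service_b, "node-1", "zone-1")])
```

Dropping the node-1 entry from `service_b` makes the same call return the node-2 endpoint, and dropping both zone-1 entries makes it return the zone-2 one.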
So far, I haven't found a good solution. I looked at Topology Aware Routing, but it only seems to take effect for requests made directly from a worker node, not for requests made from Service-A Pods. When I test from a zone-1 worker node, the responses come only from Pods in zone-1, but requests from a Pod reach Pods on all nodes. Am I missing something here? Is there a better way to achieve this? Thanks a lot!
3 Answers
You might want to look at how Envoy-based service meshes can be configured to respect node topology labels. With Cilium, you can also use topology-aware hints directly on Services, which lets you control how traffic is distributed across zones. Cilium's documentation and the Kubernetes service routing docs cover how to enable it in detail.
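As a rough illustration of opting a Service in to topology-aware hints, the sketch below builds a Service manifest as a Python dict and prints it as JSON. The annotation key `service.kubernetes.io/topology-mode` is the one used by Kubernetes 1.27 and later (older releases used `service.kubernetes.io/topology-aware-hints`). The Service name, the `app` label, and port 8080 are assumptions, not details from the question.

```python
import json


def service_manifest(name, app_label, port, topology_mode="Auto"):
    """Build a ClusterIP Service dict that opts in to Topology Aware Routing."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": {
                # Kubernetes 1.27+; older releases used
                # service.kubernetes.io/topology-aware-hints instead.
                "service.kubernetes.io/topology-mode": topology_mode,
            },
        },
        "spec": {
            "type": "ClusterIP",
            # Assumed label; use whatever your Service-B Pods actually carry.
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


# Hypothetical name, label, and port for Service-B.
manifest = service_manifest("service-b", "service-b", 8080)
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON, the output can be piped into `kubectl apply -f -`. As far as I know, Cilium's kube-proxy replacement only honors these hints when the Helm value `loadBalancer.serviceTopology=true` is set, so that is worth checking against the docs for your Cilium version. Also note that the hints work per zone, so on their own they would not give you the same-node preference.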
Have you looked into using EndpointSlices? They can provide hints for zones in the endpoints and might help in controlling where requests go based on the zone.
Yeah! I saw that each entry under .items[].endpoints[] has a .hints.forZones field. Is there anything specific I should focus on?
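One thing worth checking is which endpoints actually carry hints and for which zones. The sketch below walks EndpointSlice JSON shaped like the output of `kubectl get endpointslices -o json` and lists each address with its node, zone, and hinted zones. The sample data, the slice name, and the `summarize_hints` helper are hand-written for illustration; only the field names (`endpoints`, `addresses`, `nodeName`, `zone`, `hints.forZones`) come from the EndpointSlice API.

```python
import json

# Hand-written sample shaped like `kubectl get endpointslices -o json` output.
SAMPLE = """
{
  "items": [
    {
      "metadata": {"name": "service-b-abc12"},
      "endpoints": [
        {"addresses": ["10.0.1.10"], "nodeName": "node-1", "zone": "zone-1",
         "hints": {"forZones": [{"name": "zone-1"}]}},
        {"addresses": ["10.0.3.10"], "nodeName": "node-3", "zone": "zone-2"}
      ]
    }
  ]
}
"""


def summarize_hints(slice_list):
    """Return (address, node, zone, hinted zones) for every endpoint."""
    rows = []
    for item in slice_list.get("items", []):
        for ep in item.get("endpoints") or []:
            # `hints` may be absent or null when no hints were assigned.
            for_zones = (ep.get("hints") or {}).get("forZones") or []
            hinted = [z["name"] for z in for_zones]
            for addr in ep.get("addresses", []):
                rows.append((addr, ep.get("nodeName"), ep.get("zone"), hinted))
    return rows


for row in summarize_hints(json.loads(SAMPLE)):
    print(row)
```

An empty list in the last column means that endpoint has no hint set, which is usually the first thing to rule out when traffic still crosses zones.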
Depending on your use case, you could split Service-B into multiple Deployments with different node affinities, each behind its own Service. For example, you could have Service-Ba and Service-Bb, each tied to a specific zone, and configure Service-A to call the Service that matches its own zone. That would keep your traffic localized.
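On the Service-A side, the per-zone lookup could be as simple as the sketch below. Everything in it is hypothetical: the lowercase Service names, the cross-zone fallback Service, the `default` namespace, the default `cluster.local` cluster domain, and the `NODE_ZONE` environment variable, which you would have to inject into the Pod yourself.

```python
import os

# Hypothetical per-zone Service names, one per zone-pinned Deployment.
ZONE_SERVICES = {
    "zone-1": "service-ba",
    "zone-2": "service-bb",
}
# Hypothetical catch-all Service that selects Service-B Pods in every zone.
FALLBACK_SERVICE = "service-b"


def backend_host(zone, namespace="default"):
    """Return the in-cluster DNS name Service-A should call for its zone."""
    name = ZONE_SERVICES.get(zone, FALLBACK_SERVICE)
    # Assumes the default cluster domain, cluster.local.
    return f"{name}.{namespace}.svc.cluster.local"


# NODE_ZONE is an assumed variable; unknown or missing zones use the fallback.
print(backend_host(os.environ.get("NODE_ZONE", "")))
```

The trade-off is that Service-A now owns the routing decision, so a zone whose Service-B Pods are all down needs its own failover handling.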
Thanks for the tip! I said Cilium wasn't relevant because I thought Pod selection happens at the kube-proxy level and that Cilium only comes into play afterward. Is that not the case?