I recently stumbled upon a decade-old GitHub issue discussing how to achieve a highly available control plane without putting a load balancer between the cluster and the API servers. Today, Kubernetes effectively requires an external load balancer from some infrastructure provider for this. Imagine if you had three Linux servers and could just create a DNS record pointing at those three IPs. How much easier would that be? If client-go could handle the failover itself, setting up on-prem clusters would be a breeze. What are your thoughts on this approach?
4 Answers
While it's not the most common method, it is possible to set up services that communicate directly without a load balancer. Some systems out there use the Kubernetes API to fetch endpoint IPs and connect directly, depending on how your Kubernetes networking is configured.
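As a rough illustration of that pattern, here is a minimal Go sketch: the client holds a list of endpoint addresses (which it could have fetched from the Kubernetes API) and rotates through them itself, skipping any that refuse a connection. The IPs and port are hypothetical, and a real client would also refresh the endpoint list periodically rather than keep a fixed one.

```go
package main

import (
	"fmt"
	"net"
	"sync/atomic"
	"time"
)

// roundRobin cycles through a fixed list of API server endpoints so
// each client spreads its connections without an external load
// balancer in front of the control plane.
type roundRobin struct {
	endpoints []string
	next      uint64
}

// pick returns the next endpoint in rotation.
func (r *roundRobin) pick() string {
	n := atomic.AddUint64(&r.next, 1)
	return r.endpoints[(n-1)%uint64(len(r.endpoints))]
}

// dialAny tries each endpoint in rotation until one accepts a TCP
// connection, so a single dead control-plane node gets skipped.
func (r *roundRobin) dialAny(timeout time.Duration) (net.Conn, error) {
	var lastErr error
	for i := 0; i < len(r.endpoints); i++ {
		addr := r.pick()
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err == nil {
			return conn, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("all endpoints failed: %w", lastErr)
}

func main() {
	// Hypothetical control-plane addresses; in the question's scenario
	// these would be the three Linux servers.
	rr := &roundRobin{endpoints: []string{
		"10.0.0.1:6443",
		"10.0.0.2:6443",
		"10.0.0.3:6443",
	}}
	fmt.Println(rr.pick()) // 10.0.0.1:6443
	fmt.Println(rr.pick()) // 10.0.0.2:6443
	fmt.Println(rr.pick()) // 10.0.0.3:6443
	fmt.Println(rr.pick()) // wraps around to 10.0.0.1:6443
	_ = rr.dialAny // a real client would dial with this instead of a fixed server URL
}
```

The rotation deliberately lives in the client, which is exactly the trade-off discussed in this thread: simpler infrastructure in exchange for each client managing its own failover.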
If you're looking to avoid load balancers, consider using RKE2. Otherwise, I'd recommend going with a dedicated load balancer solution like MetalLB, Kube-VIP, or HAProxy with Keepalived. There are plenty of good options available.
How accurate can client-side load balancing be when the client has no view of the overall load on the API servers? You would also still need some redundancy to avoid a single point of failure; a virtual IP or a DNS record could help there.
In my experience, client-side load balancing isn't a one-size-fits-all solution. But it would definitely simplify creating a highly available control plane for on-prem environments. Setting it up with three Linux servers and a simple DNS record would be a dream!
Check out this article I found about intelligent Kubernetes load balancing: https://www.databricks.com/blog/intelligent-kubernetes-load-balancing-databricks.
I appreciate the link, but that article addresses a different problem. I'm focused on small- to medium-scale setups and on access to the API server itself, not on load balancing Services. What I'm dreaming of is client-side load balancing to the Kubernetes API server that works seamlessly for tools like kubectl and helm.

At my previous job at Twitter, we relied on client-side load balancing to avoid single points of failure. This blog post provides some insight into our implementation: https://blog.x.com/engineering/en_us/topics/infrastructure/2019/daperture-load-balancer.