Best Practices for Setting Up a Highly Available Kubernetes Cluster Across Two Data Centers?

July 21, 2025

Asked By TechWhiz42 On July 21, 2025

Hey everyone! I'm setting up a production-grade, highly available Kubernetes cluster on-premises that spans two physical data centers. I have hands-on experience with Kubernetes in the cloud, but upper management is pushing for a specific plan that I'm not totally on board with. They want me to run both the Master (control plane) and Worker roles on a single physical server in each data center, which means a cluster of just two nodes for now. I'm concerned about quorum and overall reliability.

Here's what I'm working with:
- Two big bare metal servers (one in each DC)
- A dedicated 100 Gbps link connecting the two data centers
- In about 7 months, we're expecting to add a third data center and server
- The goal is to deploy an internal AI platform using Helm charts

I'm looking for some guidance on how to design for high availability right from the start with these resources:
1. What's the best approach to establishing HA with only two nodes?
2. How do I handle etcd quorum until the third node is in play? Could an Active-Passive setup be worth considering?
3. What are your thoughts on networking, load balancing, and the choice between overlay vs underlay for pod traffic?
4. Any tips for managing secrets safely for pulling Helm charts?
5. What tools or stacks do you recommend for bare-metal automation?

I'd really appreciate any insights you all might have before I present this to my team tomorrow!

4 Answers

Answered By ServerSage88 On July 24, 2025

I agree with CloudHunter99. Running active-passive with only two servers is risky: if the network between the sites goes down, neither server can reach quorum on its own. I'd treat each data center as a separate cluster for now and use a primary/secondary model. If you need more reliability, you'll really want additional servers at each site.

TechWhiz42 - July 24, 2025

Thanks for the insight! I was trying to convince my manager that we need more machines for a proper HA setup, but it seems upper management isn’t budging on the current plan.

Answered By K8sGuru88 On July 23, 2025

You definitely need three servers for a high-availability control plane setup. With only two, you're not really achieving HA. If management insists on using these servers, maybe consider something simple like Kind (Kubernetes in Docker) on VMs instead.
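To put numbers on the "three servers" point: etcd only accepts writes while a majority of its members can reach each other. Here is a minimal Python sketch of that majority arithmetic. The function names are made up for illustration and are not part of any etcd or Kubernetes tooling.

```python
def quorum(members: int) -> int:
    """Votes needed for a majority among `members` etcd voting members."""
    if members < 1:
        raise ValueError("members must be >= 1")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(
            f"{n} member(s): quorum={quorum(n)}, "
            f"tolerates {failures_tolerated(n)} failure(s)"
        )
```

With two members the quorum is two, so losing either server (or the link between them) stops the control plane from accepting writes. That is no better than a single member in terms of failures tolerated, which is why the third site matters.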

Answered By NetWiz82 On July 23, 2025

Latency between your control-plane nodes is also a huge factor! Ideally, you want it under 30 ms for etcd operations. If you're really committed to a multi-site setup, something like Cilium's Cluster Mesh for connecting separate clusters could be beneficial, but keep in mind that it brings its own complexity. Let me know if you want to discuss it further!
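If you want to sanity-check etcd timing against your measured round-trip time, here is a rough Python sketch. It assumes the usual rule of thumb from the etcd tuning guidance as I understand it: a heartbeat interval of roughly the RTT between members, and an election timeout of at least 10 times the RTT. It also assumes you never want to go below etcd's defaults of 100 ms and 1000 ms. The function name and the "never below defaults" policy are my own choices, so check them against the etcd docs for your version.

```python
import math

# etcd's documented defaults, in milliseconds.
ETCD_DEFAULT_HEARTBEAT_MS = 100
ETCD_DEFAULT_ELECTION_MS = 1000


def suggest_etcd_timing(rtt_ms: float) -> dict:
    """Suggest heartbeat/election values (ms) from a measured member RTT.

    Assumption: heartbeat is about one RTT and election timeout is
    10x the heartbeat, with neither value going below etcd's defaults.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    heartbeat = max(ETCD_DEFAULT_HEARTBEAT_MS, math.ceil(rtt_ms))
    election = max(ETCD_DEFAULT_ELECTION_MS, 10 * heartbeat)
    return {
        "heartbeat_interval_ms": heartbeat,
        "election_timeout_ms": election,
    }


if __name__ == "__main__":
    timing = suggest_etcd_timing(30)  # 30 ms is the figure mentioned above
    print(
        f"--heartbeat-interval={timing['heartbeat_interval_ms']} "
        f"--election-timeout={timing['election_timeout_ms']}"
    )
```

Under these assumptions, anything at or below 100 ms RTT just lands on the defaults, so the values only need changing if the inter-site link turns out slower than expected.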

Answered By CloudHunter99 On July 22, 2025

Honestly, it might be more efficient to just consolidate both servers into one data center. Trying to maintain high availability with only two nodes spread out like this isn't ideal: true HA requires a minimum of three nodes for control plane redundancy, and etcd is sensitive to network latency between nodes.
