I'm currently developing an app and setting up its infrastructure. The setup consists of a hub cluster that hosts HashiCorp Vault, Cloudflared (the tunnel), and Karmada (though I'm planning to replace Karmada with Flux's hub-and-spoke model soon). I also have a region-1 cluster connected to the hub using Linkerd. My main concern is Linkerd's multi-cluster support. It serves its purpose, but it introduces a lot of sidecars, and I worry that as I scale to a multi-region setup, managing the connections between clusters for cross-regional database syncs (like with CockroachDB) will become chaotic. I'm looking for advice on simpler solutions for cross-cluster networking. My research points towards building overlays with something like Nebula, but that feels like even more manual work. Otherwise, I'm left with the complexity of Istio or Linkerd. I might be misdesigning something, so any help is appreciated!
6 Answers
You might want to clarify what you're aiming to achieve with this architecture. Creating a single application instance that spans multiple regions isn't usually the best route. Instead, designing smaller failure domains can help manage deployments and reduce risks of global outages. If a global app is indeed the goal, consider tools specifically designed for that, such as Cloudflare Workers.
If all regions run the same internal services, you should look into global server load balancing (GSLB). Cloudflare offers this, but it comes at a price. Alternatively, you can implement something similar using HAProxy, which can effectively manage your cross-regional traffic, as evidenced by case studies like PayPal's at HAProxyConf.
Make sure your app even has users before diving into this complex infrastructure. Get it running well on a small scale, maybe even a Raspberry Pi. Focus on building your MVP for both the app and the infrastructure first, then think about scaling up. Once you start hitting major user growth, you can invest in a larger infrastructure and team to manage it effectively.
My recommendation is to keep it straightforward: avoid a full-mesh service mesh across regions. Instead, use a hub-and-spoke L3 network with a simple east-west gateway per cluster. Here's what worked for me: use the hub only for control (Vault, Flux), connect the spokes through network peering, and give the CockroachDB nodes stable addresses. For any inter-cluster HTTP communication, run an Envoy or NGINX gateway in each cluster. Keep user traffic centralized through Cloudflare, and use overlays only for the east-west connections. I've had success with lightweight solutions like Submariner for discovery.
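To put rough numbers on why hub-and-spoke is easier to manage than a full mesh, here is a minimal Python sketch that counts the cluster-to-cluster links you would have to maintain in each topology. The function names are made up for illustration, and it assumes one link per pair of directly connected clusters, with the hub counted as one of the clusters.

```python
def full_mesh_links(clusters: int) -> int:
    """Links needed when every cluster connects directly to every other one."""
    return clusters * (clusters - 1) // 2


def hub_and_spoke_links(clusters: int) -> int:
    """Links needed when every spoke connects only to the hub.

    The hub is counted as one of the clusters, so n clusters need n - 1 links.
    """
    return max(clusters - 1, 0)


if __name__ == "__main__":
    # Compare both topologies as the number of clusters grows.
    for n in (2, 4, 8, 16):
        print(f"{n} clusters: mesh={full_mesh_links(n)}, "
              f"hub-and-spoke={hub_and_spoke_links(n)}")
```

The full mesh grows quadratically while hub-and-spoke grows linearly. The trade-off is that spoke-to-spoke traffic takes an extra hop through the hub, which matters for latency-sensitive replication.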
It sounds like your setup could benefit from a little simplification. Have you considered hosting your services in one region and using global services like AWS Global Accelerator or Cloudflare? This can help reduce latency for your users without the need for a complex cross-cluster network.
You're not doing anything fundamentally wrong; multi-region setups are just inherently complex. Linkerd works fine at first, but it can become unwieldy as the number of clusters grows. If secure communication and discovery are your primary needs, maybe skip the full mesh. Consider lighter alternatives like Cilium or even a basic WireGuard setup. Centralizing Vault while letting CockroachDB manage its own cross-region syncing is a solid approach. Remember, only use a service mesh when it genuinely adds value, or you'll end up complicating things unnecessarily.
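As a rough idea of what a "basic WireGuard setup" could look like for a hub with a few spokes, here is a minimal Python sketch that renders a wg-quick style config for the hub. Everything specific in it is an assumption for illustration: the `Spoke` class, the cluster names, the CIDRs, the endpoints, and the placeholder keys are all hypothetical, and real keys would come from `wg genkey` / `wg pubkey`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spoke:
    """One spoke cluster as seen from the hub (all values are hypothetical)."""
    name: str
    public_key: str  # placeholder; generate real keys with `wg genkey | wg pubkey`
    cidr: str        # network routed to this spoke through the tunnel
    endpoint: str    # host:port where the spoke's WireGuard listener is reachable


def render_hub_config(private_key: str, address: str,
                      listen_port: int, spokes: List[Spoke]) -> str:
    """Render a wg-quick style config with one [Peer] section per spoke."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        f"ListenPort = {listen_port}",
    ]
    for spoke in spokes:
        lines += [
            "",
            f"# {spoke.name}",
            "[Peer]",
            f"PublicKey = {spoke.public_key}",
            f"AllowedIPs = {spoke.cidr}",
            f"Endpoint = {spoke.endpoint}",
            "PersistentKeepalive = 25",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; none of these come from a real deployment.
    spokes = [
        Spoke("region-1", "<region-1-public-key>", "10.10.0.0/16",
              "region-1.example.com:51820"),
        Spoke("region-2", "<region-2-public-key>", "10.20.0.0/16",
              "region-2.example.com:51820"),
    ]
    print(render_hub_config("<hub-private-key>", "10.0.0.1/24", 51820, spokes))
```

Adding a region then means adding one `Spoke` entry and re-rendering the hub config, rather than touching every existing cluster.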
