I'm looking for recommendations on the best solutions to deploy a full Kubernetes (k8s) cluster on-premises. I'm starting this as a proof of concept (PoC), but definitely plan to use it for production services later. I've got three decent servers that I want to utilize.
I've come across k3s, but it doesn't seem suitable for large production clusters. I'm considering just using kubeadm to set everything up myself, including ingress, custom resource definitions (CRDs), and high availability (HA). I've also heard good things about Talos, but I want to start with Debian 13 as the host OS.
The goal is to have a highly configurable and automated setup with support for network policies. If anyone has insights on how to architect this and what solutions to try, I'd really appreciate it!
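To make "support for network policies" concrete, here's a minimal sketch of the kind of policy I'd want the cluster to enforce: a default-deny for ingress traffic in one namespace. The namespace `poc` and the policy name are just placeholders I made up, and whether the policy is actually enforced depends on the network plugin supporting NetworkPolicy.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every pod in `namespace`
    and allows no ingress traffic (no ingress rules are listed)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "default-deny-ingress",  # placeholder name
            "namespace": namespace,
        },
        "spec": {
            "podSelector": {},            # empty selector = all pods in the namespace
            "policyTypes": ["Ingress"],   # only ingress is restricted here
        },
    }


if __name__ == "__main__":
    # "poc" is a hypothetical namespace used only for illustration.
    print(json.dumps(default_deny_ingress("poc"), indent=2))
```

kubectl accepts JSON manifests as well as YAML, so the output could be piped into `kubectl apply -f -` on a test cluster.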
8 Answers
I've been using k3s with Calico in a high-availability setup, and it's been fantastic! Fast updates, small footprint, plus the HA capability is definitely a plus!
I made the switch from AWS EKS to a self-hosted Talos setup, and it's been incredibly reliable. We're saving over $30k monthly and running five clusters without issues!
That's awesome! I was thinking about doing the same. I posted on a forum recently, and some people were really skeptical about the switch, so it's good to hear success stories.
What storage are you using with Talos? I've been using Longhorn, which works fine, but I'm curious about your setup!
Consider using OpenShift if you're building something larger. It’s solid on bare metal, though it does have higher hardware requirements.
Interesting, but it might be overkill for smaller setups.
Honestly, I would recommend going for Talos if you're self-hosting. It's solid and fully immutable, which makes it easier to manage. Just keep in mind it can be challenging to debug rare issues, since there's no SSH or shell on the nodes and everything goes through its API.
Absolutely! Talos is definitely a game-changer.
True, but that could complicate things when you're analyzing production issues that can't be replicated easily.
If you're going for a full install, here's a good stack:
- Deploy HA etcd with at least three masters.
- Use Longhorn for persistent volumes.
- RKE2 for cluster management.
- Use Jenkins for CI, Argo CD for CD, and Prometheus/Grafana for monitoring.
- Use NGINX as the ingress controller and MetalLB for load balancing.
Most of these (everything that runs on top of the cluster) can be installed via Helm charts for easier deployment!
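On the "at least three" part: etcd needs a majority of its members to agree, so the member count decides how many failures you can survive. A quick sketch of the arithmetic (the function names are my own, not from any tool):

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} members: quorum {quorum(n)}, "
              f"tolerates {failures_tolerated(n)} failure(s)")
```

Three members tolerate one failure, and a fourth member doesn't improve on that, which is why odd sizes (3 or 5) are the usual choice.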
Thanks for this! Are your nodes virtual machines or bare metal?
Can you have those masters on separate nodes in different regions with different public IPs? It seems like most resources assume they're in the same subnet.
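One thing that matters here regardless of subnets: etcd is sensitive to round-trip time (RTT) between members. As far as I understand the etcd tuning guidance, the heartbeat interval should be roughly the RTT between members and the election timeout at least 10 times the RTT, with defaults of 100 ms and 1000 ms. Below is a rough sketch that turns a measured RTT into starting values for the `--heartbeat-interval` and `--election-timeout` flags. The helper name and the rounding are my own assumptions, so treat the output as a starting point to test, not a recommendation.

```python
import math

ETCD_DEFAULT_HEARTBEAT_MS = 100
ETCD_DEFAULT_ELECTION_MS = 1000


def suggest_etcd_timeouts(rtt_ms: float) -> dict:
    """Hypothetical helper: derive starting etcd timeouts from the
    measured round-trip time between the farthest members."""
    # Heartbeat about one RTT, but never below the etcd default.
    heartbeat = max(ETCD_DEFAULT_HEARTBEAT_MS, math.ceil(rtt_ms))
    # Election timeout 10x the heartbeat, which is also >= 10x the RTT.
    election = max(ETCD_DEFAULT_ELECTION_MS, 10 * heartbeat)
    return {"heartbeat_interval_ms": heartbeat, "election_timeout_ms": election}


def as_flags(timeouts: dict) -> str:
    """Render the values as etcd command-line flags (milliseconds)."""
    return (f"--heartbeat-interval={timeouts['heartbeat_interval_ms']} "
            f"--election-timeout={timeouts['election_timeout_ms']}")


if __name__ == "__main__":
    # Example RTTs in milliseconds; measure your own between the real nodes.
    for rtt in (2, 40, 150):
        print(f"RTT {rtt} ms -> {as_flags(suggest_etcd_timeouts(rtt))}")
```

Within one site the defaults are kept as they are; only a high inter-region RTT pushes the values up, at the cost of slower failure detection.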
We've had great success using MicroK8s in production. So far, we have no complaints, and it handles light workloads pretty well.
I’ve used it too! It's pretty solid but can be a bit resource-intensive.
RKE2 is an excellent choice for larger clusters; it’s based on k3s and truly production-ready. We run it with Rancher, and it has proven to be very solid overall.
I agree with RKE2! But I noticed that their documentation mainly lists Ubuntu. Do you think it will work just as well on Debian?
For sure, there aren't many differences between the two, but package availability does vary.
If you're looking for something hands-on, kubeadm is the way to go. It's a bit painful at times, but understanding how things work under the hood is worth it.
Absolutely! Especially if you set it up with Cluster API; it makes things a lot easier.
Why create extra work for yourself if you can pick a more automated solution?

I started with k3s, but the resources I found were all for HA within the same private IP space. I want HA across different servers with public IPs. Can Calico support that?