Hey everyone! I'm in the process of designing an Amazon EKS cluster and I could use some advice. My setup will include a monitoring stack with VictoriaMetrics, Grafana, and Loki, along with databases and potentially some stateless microservice pods. I'm planning to leverage Karpenter for autoscaling and provisioning, and I wanted to get your thoughts on the structure I've proposed:
1. A NodePool for stateful applications on memory-optimized instances, consolidating only when empty, with a taint of `karpenter.sh/stateful: NoSchedule` and a label of `karpenter.sh/stateful: true`.
2. A separate NodePool for stateless applications using spot instances and enabling full consolidation.
This setup lets me restrict the EBS CSI DaemonSet to the nodes that actually need it, using node affinity on the stateful label, which saves resources on the stateless nodes. It should also keep Karpenter from deleting stateful nodes, since they will always have workloads running on them.
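For reference, here is a rough sketch of what the two pools described above might look like. It assumes the Karpenter `karpenter.sh/v1` API and an existing `EC2NodeClass` named `default`; the pool names and the `consolidateAfter` values are placeholders. One deliberate swap: the sketch uses a hypothetical `example.com/workload-type` key for the label and taint, because as far as I know Karpenter rejects custom labels in the `karpenter.sh` domain.

```yaml
# Stateful pool: on-demand, memory-optimized, consolidated only when empty.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: stateful                      # placeholder name
spec:
  template:
    metadata:
      labels:
        example.com/workload-type: stateful   # hypothetical label key
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                 # assumed to exist already
      taints:
        - key: example.com/workload-type      # hypothetical taint key
          value: stateful
          effect: NoSchedule
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["r"]               # memory-optimized families
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m              # placeholder value
---
# Stateless pool: spot capacity, full consolidation.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: stateless                     # placeholder name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m              # placeholder value
```

Stateful pods would then need a matching toleration and a node selector (or node affinity) on the same key to land on the first pool.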
What do you think about this approach?
3 Answers
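You can also avoid potential disruptions by setting the `karpenter.sh/do-not-disrupt: "true"` annotation on the pod template of your StatefulSets. With that in place, Karpenter won't voluntarily disrupt the nodes those pods are running on. Just keep in mind the `terminationGracePeriod` on your NodePools: once a node starts draining and that period elapses, pods are removed even if they carry the annotation.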
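A minimal sketch of that combination, assuming the Karpenter `karpenter.sh/v1` API. The StatefulSet name, labels, image tag, and the duration values are placeholders, and the NodePool is shown only as a fragment with its other required fields left out.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki                          # placeholder name
spec:
  serviceName: loki
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      labels:
        app: loki
      annotations:
        # Goes on the pod template, so every pod of the set carries it.
        karpenter.sh/do-not-disrupt: "true"
    spec:
      containers:
        - name: loki
          image: grafana/loki:3.0.0   # placeholder tag
---
# NodePool fragment only: nodeClassRef, requirements, etc. omitted.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: stateful
spec:
  template:
    spec:
      expireAfter: 720h               # placeholder: node lifetime before forced replacement
      terminationGracePeriod: 48h     # placeholder: upper bound on how long a drain may take
```

The trade-off is that a long `terminationGracePeriod` gives annotated pods more time to move on their own schedule, while a short one means expired nodes get cleaned up sooner regardless of the annotation.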
Yeah, definitely something to consider for maintaining uptime!
It seems like you might be overcomplicating things a bit. You don't necessarily need two separate NodePools for this setup. Consider a single NodePool with a requirement on the `karpenter.sh/capacity-type` key, using values like `spot`, `on-demand`, or `reserved`, and add a node selector on that label to the workloads that must run on on-demand instances. That way, you can still place your stateful and stateless apps appropriately without the overhead of multiple NodePools.
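Roughly, the single-pool variant could look like the sketch below. It assumes the `karpenter.sh/v1` API and an existing `EC2NodeClass` named `default`; the pool name and `consolidateAfter` value are placeholders.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general                       # placeholder name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                 # assumed to exist already
      requirements:
        # Let the pool launch either capacity type; pods pick via nodeSelector.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m              # placeholder value
---
# Pod template fragment for a workload that must stay on on-demand capacity.
spec:
  template:
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: on-demand
```

Note that a NodePool has a single `disruption` block, so in this layout the same consolidation policy applies to on-demand and spot nodes alike. Stateful pods would have to be protected some other way, for example with the do-not-disrupt annotation or PodDisruptionBudgets.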
I appreciate that suggestion! So just to clarify, if I set it up like this, it should resolve the issue without needing multiple node pools, right? That sounds much easier!
Yep, exactly! This way, you keep everything neater and just make sure to use the right selectors.
Your idea generally makes sense, especially the way you're separating spot-backed stateless workloads from stateful ones. However, double-check whether restricting the EBS CSI DaemonSet by label is really worth the added complexity, since it's usually lightweight. Also make sure your disruption budgets and storage class settings line up with how Karpenter drains and replaces nodes, to avoid hiccups in edge cases.
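As one possible reading of "aligned", here is a small sketch with a PodDisruptionBudget and an EBS-backed StorageClass. The resource names, the `app: victoriametrics` selector, and the `gp3` volume type are placeholders. `WaitForFirstConsumer` delays volume creation until the pod is scheduled, so the EBS volume ends up in the same availability zone as the node Karpenter launches.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: victoriametrics-pdb           # placeholder name
spec:
  maxUnavailable: 1                   # at most one pod evicted at a time during a drain
  selector:
    matchLabels:
      app: victoriametrics            # placeholder selector
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-wait                      # placeholder name
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
```

Since EBS volumes are zonal, a rescheduled stateful pod can only come back in the zone where its volume lives, which is worth keeping in mind when choosing the zones a NodePool is allowed to use.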
Thanks for your insight! I’ll make sure to review those aspects.
Absolutely, it’s better to be safe and align everything to prevent issues!

Good point! I didn't think about the disruption aspect. Thanks for the tip!