I'm working on a way to cut EKS costs in non-production clusters. I've written a Terraform module that scales managed node groups down to zero when a cluster isn't in use. Since an EKS cluster can't be stopped, this removes the worker-node (EC2) charges, although the control plane is still billed. The module uses Amazon EventBridge schedules to trigger Lambda functions that perform the scale operations, so it suits environments with predictable usage, such as development and test clusters that can shut down on nights and weekends. If you've run a similar setup or see gaps in my approach, I'd love to hear your feedback!
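For anyone wondering what the Lambda side of this might look like, here is a minimal sketch in Python. It is not the module's actual code. The event shape (`cluster`, `nodegroup`, `action`, `up_size`) and the helper name `build_scaling_request` are assumptions made for illustration; the only AWS call used is the EKS `update_nodegroup_config` API via boto3, which requires `maxSize` to stay at 1 or more even when the desired size is 0.

```python
from typing import Any, Dict


def build_scaling_request(cluster: str, nodegroup: str, action: str,
                          up_size: int = 2) -> Dict[str, Any]:
    """Build keyword arguments for the EKS update_nodegroup_config call.

    `action` is a hypothetical field carried in the EventBridge event:
    "down" scales the node group to zero, "up" restores `up_size` nodes.
    """
    # EKS rejects maxSize below 1, so keep it at least 1 in both directions.
    max_size = max(up_size, 1)
    if action == "down":
        scaling = {"minSize": 0, "desiredSize": 0, "maxSize": max_size}
    elif action == "up":
        scaling = {"minSize": up_size, "desiredSize": up_size, "maxSize": max_size}
    else:
        raise ValueError(f"unknown action: {action!r}")
    return {
        "clusterName": cluster,
        "nodegroupName": nodegroup,
        "scalingConfig": scaling,
    }


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Lambda entry point, invoked by a scheduled EventBridge rule."""
    # Imported here so the helper above can be exercised without boto3 installed.
    import boto3

    request = build_scaling_request(
        event["cluster"],
        event["nodegroup"],
        event["action"],
        int(event.get("up_size", 2)),
    )
    boto3.client("eks").update_nodegroup_config(**request)
    return request["scalingConfig"]
```

Two EventBridge rules (one for the evening, one for the morning) could then pass `{"cluster": "dev", "nodegroup": "workers", "action": "down"}` and the matching `"up"` payload as constant input to the same function.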
3 Answers
Have you thought about using Karpenter? It's great for event-driven scaling based on pod demand, but it sounds like your need is more for scheduled scaling. It could work alongside your solution as well.
What about kube-downscaler? It seems like it could help with this situation too!
You might want to know, though, that the original kube-downscaler repo isn't maintained anymore. A new team has taken over and is adding a lot of new features and fixes, plus they're rewriting it in Go for better speed and efficiency. You can find the active versions at the new links!
Can't you just set up scheduled actions for Auto Scaling Groups? It might achieve the same outcome without needing additional tools.
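A rough sketch of what that could look like with boto3, assuming you already know the name of the Auto Scaling group behind the managed node group (it is listed under `resources.autoScalingGroups` in the `describe_nodegroup` response). The action names, cron times, and the helper `build_scheduled_actions` are made up for illustration; `put_scheduled_update_group_action` is the real Auto Scaling API call. It is worth checking how changes made directly on the group interact with the node group's own scaling configuration before relying on this.

```python
from typing import Any, Dict, List


def build_scheduled_actions(asg_name: str, size: int = 2,
                            stop_cron: str = "0 20 * * 1-5",
                            start_cron: str = "0 7 * * 1-5",
                            time_zone: str = "Etc/UTC") -> List[Dict[str, Any]]:
    """Return parameters for two recurring scheduled actions on one ASG.

    The defaults (20:00 stop, 07:00 start, Monday to Friday) are
    illustrative assumptions, not values from the original post.
    """
    common = {"AutoScalingGroupName": asg_name, "TimeZone": time_zone}
    return [
        {   # Evening: drop the group to zero instances.
            **common,
            "ScheduledActionName": "scale-down-evening",
            "Recurrence": stop_cron,
            "MinSize": 0,
            "MaxSize": size,
            "DesiredCapacity": 0,
        },
        {   # Morning: bring the group back to its working size.
            **common,
            "ScheduledActionName": "scale-up-morning",
            "Recurrence": start_cron,
            "MinSize": size,
            "MaxSize": size,
            "DesiredCapacity": size,
        },
    ]


def apply_schedule(asg_name: str) -> None:
    """Create or update both scheduled actions on the given ASG."""
    # Imported here so the helper above can be exercised without boto3 installed.
    import boto3

    client = boto3.client("autoscaling")
    for params in build_scheduled_actions(asg_name):
        client.put_scheduled_update_group_action(**params)
```

Note that this schedule leaves the nodes running over the weekend; covering Saturday and Sunday would need the start rule to skip those days, which the `1-5` day-of-week field already does for the morning action.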
Yeah, AWS also has Instance Scheduler, which is pretty straightforward to implement. I'm just not convinced your method adds much value beyond that.

That's true! Karpenter is excellent for production with unpredictable workloads. But as you mentioned, it doesn't offer the option for scheduled downscaling, which is where your approach shines.