Hey everyone! I'm feeling really overwhelmed by the resource requests and limits for our pods on EKS. Most of our services run on Java or Node, and it seems like every developer inflates their requests way beyond what the app actually needs. For example, I see requests for 2 CPU and 4Gi of memory on apps that barely use 200m CPU and 500Mi of memory. I understand they want to be cautious, but this is really driving up our cloud costs, and our finance team is pushing us to cut expenses.

We've tried VPA, but it doesn't work well for many of our workloads. HPA helps us scale out, but it doesn't fix the mismatch between requested and actual usage. Right now we're stuck continuously adjusting YAML files while checking Prometheus graphs and rolling out pods again, which feels like a total waste of time.

Has anyone found a solution? Any scripts or tools that work? I feel like I'm missing something obvious, but everything I try either disrupts workflows or requires constant monitoring. I'd love to hear any tips or solutions that have worked for you!
6 Answers
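We've mostly moved to using limits instead of requests. It helps to monitor our applications over time for any outliers or 'rogue' apps that might consume more resources than expected. It's a lot easier to manage! One thing to keep in mind: if you set a limit without a request, Kubernetes uses the limit as the request, so the limit itself still needs to be sized sensibly.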
Have you thought about Karpenter? It provisions nodes sized to fit the pods that are actually pending, so you waste less capacity at the node level. It won't fix inflated requests on its own, but it's worth looking into if you haven't already.
Try creating a tier list for resource requests. This way, developers can only ask for certain resource amounts based on how valuable the app is to the company. If an app isn't justifying its cloud costs, they need to either find a cheaper way to run it or advocate for its value. Either way, make it a bit harder for them to scale unnecessarily.
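To make the tier idea concrete, here is a rough Python sketch of a check you could run in CI against each service's requests. The tier names, the caps, and the `violations` helper are all made up for illustration, and the quantity parsing only handles the common `m`, `Mi`, and `Gi` forms. On the cluster side, a per-namespace LimitRange could enforce similar caps.

```python
# Hypothetical tier caps: adjust names and numbers to your own policy.
TIER_CAPS = {
    "critical": {"cpu_m": 4000, "mem_mi": 8192},
    "standard": {"cpu_m": 1000, "mem_mi": 2048},
    "best-effort": {"cpu_m": 250, "mem_mi": 512},
}


def parse_cpu_millicores(quantity):
    """Convert a CPU quantity such as '200m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory_mib(quantity):
    """Convert a memory quantity such as '500Mi' or '4Gi' to MiB."""
    if quantity.endswith("Gi"):
        return round(float(quantity[:-2]) * 1024)
    if quantity.endswith("Mi"):
        return round(float(quantity[:-2]))
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


def violations(tier, cpu_request, mem_request):
    """Return human-readable messages for requests that exceed the tier cap."""
    caps = TIER_CAPS[tier]
    problems = []
    cpu_m = parse_cpu_millicores(cpu_request)
    mem_mi = parse_memory_mib(mem_request)
    if cpu_m > caps["cpu_m"]:
        problems.append(f"cpu {cpu_m}m exceeds {tier} cap of {caps['cpu_m']}m")
    if mem_mi > caps["mem_mi"]:
        problems.append(f"memory {mem_mi}Mi exceeds {tier} cap of {caps['mem_mi']}Mi")
    return problems


if __name__ == "__main__":
    # The inflated request from the question, checked against the 'standard' tier.
    for message in violations("standard", "2", "4Gi"):
        print(message)
```

A team that wants more than its tier allows then has to ask for a tier change, which is where the conversation about the app's value happens.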
You might want to consider implementing cost attribution! By attributing costs to the developers, you can encourage them to lower their requests. Maybe have finance talk directly to the dev teams about it, so the pressure shifts away from you. It creates an incentive for them to manage their resource demands more efficiently.
Check out tools like Kubecost or OpenCost! We use them to provide efficiency reports and even set up a chargeback model to hold teams accountable for their usage. This way, everyone’s more mindful about their resource consumption.
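If you want a feel for what such an efficiency report boils down to before adopting a tool, here is a minimal Python sketch. It is not how Kubecost or OpenCost compute anything; the unit prices, field names, and team names are assumptions, and the usage numbers would come from your own monitoring.

```python
from collections import defaultdict

# Hypothetical blended prices; replace with figures from your own bill.
CPU_CORE_HOUR_USD = 0.04
MEM_GIB_HOUR_USD = 0.005


def efficiency(used, requested):
    """Fraction of the request that is actually used (0.0 if nothing requested)."""
    return used / requested if requested else 0.0


def idle_cost_per_hour(workload):
    """Hourly cost of requested-but-unused CPU and memory across all replicas."""
    idle_cpu = max(0.0, workload["cpu_request"] - workload["cpu_used"])
    idle_mem = max(0.0, workload["mem_request_gib"] - workload["mem_used_gib"])
    per_replica = idle_cpu * CPU_CORE_HOUR_USD + idle_mem * MEM_GIB_HOUR_USD
    return per_replica * workload["replicas"]


def idle_cost_by_team(workloads):
    """Sum the idle cost per owning team, as a basis for chargeback."""
    totals = defaultdict(float)
    for workload in workloads:
        totals[workload["team"]] += idle_cost_per_hour(workload)
    return dict(totals)


if __name__ == "__main__":
    # Illustrative data; the first entry roughly mirrors the question's example.
    workloads = [
        {"team": "payments", "replicas": 3, "cpu_request": 2.0, "cpu_used": 0.2,
         "mem_request_gib": 4.0, "mem_used_gib": 0.5},
        {"team": "search", "replicas": 2, "cpu_request": 0.5, "cpu_used": 0.4,
         "mem_request_gib": 1.0, "mem_used_gib": 0.8},
    ]
    for team, cost in idle_cost_by_team(workloads).items():
        print(f"{team}: ${cost:.4f}/hour of idle requests")
```

Showing teams the idle cost rather than the total cost keeps the focus on the gap between requests and usage.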
I'd suggest setting requests and limits based on historical monitoring data, and maybe incorporating some load testing. Don't let the devs make those decisions alone, but definitely communicate with them about it. Team input is important!
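As a rough illustration of "based on historical monitoring data", here is a small Python sketch that turns a list of usage samples into a suggested request. The 95th percentile and 20% headroom are assumptions rather than a recommendation, and the samples are made up; in practice they would be exported from Prometheus over a representative period that includes peak traffic.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_request(samples, pct=95, headroom_pct=20):
    """Suggested request: the chosen percentile of usage plus a headroom margin."""
    base = percentile(samples, pct)
    return math.ceil(base * (100 + headroom_pct) / 100)


if __name__ == "__main__":
    # Made-up samples: CPU in millicores, memory in MiB.
    cpu_millicores = [180, 200, 210, 190, 250, 220, 205, 195, 230, 400]
    mem_mib = [480, 500, 510, 495, 520, 530, 505, 515, 525, 540]
    print(f"suggested cpu request: {suggest_request(cpu_millicores)}m")
    print(f"suggested memory request: {suggest_request(mem_mib)}Mi")
```

For Java services in particular, memory samples should cover the heap after warm-up, since early samples can understate steady-state usage.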
