I'm currently auditing resource limits for EKS nodes across about 40 applications. My goal is to identify where resources are being wasted and propose adjustments to the limits and requests. For example, if an application uses around 531Mi of memory but has a limit of 1000Mi (about 1GiB), I know that limit needs to be adjusted. I'm considering where it should go; 600Mi seems too close. Is there a general guideline for setting resource limits?
Similarly, for a service using an average of 10.1 millicores (10.1m) of CPU with a limit set at 1 core, I know CPU throttling isn't a service killer, but I'd like to know how close I can set the limit to the average usage. Any tips?
3 Answers
I typically look at the max memory usage over the last couple of months, add a buffer on top, and set the limit that way. Don't rely on average usage metrics for memory, since averages hide spikes that will get the pod OOMKilled. For CPU, I often don't enforce strict limits at all: CPU is compressible, so a pod that exceeds its share gets throttled rather than killed, and it's less likely to starve the service if configured correctly. Just be cautious about CPU starvation in extreme cases; it can cause significant issues if clients keep retrying aggressively.
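The "max plus a buffer" approach above can be sketched as a small helper. This is a minimal illustration, not a fixed rule: the 25% buffer and the rounding to the next 64 Mi are assumptions I've picked for the example, so tune both to your workload.

```python
import math

def recommend_memory_limit(max_usage_mi: float, buffer: float = 0.25) -> int:
    """Recommend a memory limit in Mi: max observed usage plus a safety
    buffer, rounded up to the next 64 Mi so the manifest value stays tidy."""
    raw = max_usage_mi * (1 + buffer)
    return int(math.ceil(raw / 64) * 64)

# Example figure from the question: ~531 Mi observed peak usage
print(recommend_memory_limit(531))  # → 704
```

Feeding this the question's 531 Mi peak with a 25% buffer suggests a limit around 704 Mi rather than 600 Mi, which matches the intuition that 600 is too close to the peak.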
Using KRR or VPA in recommender mode can help provide tailored suggestions for requests and limits. It's worth checking out if you're unsure where to start.
A common approach is to set the requests slightly above average usage and keep the limits much higher. The key is to avoid getting OOMKilled, so make sure to leave some buffer. For CPU, just establish what you consider reasonable usage for requests and keep limits significantly above that. Also, tools like Vertical Pod Autoscaler and Goldilocks can offer insights into optimal settings.
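The CPU side of this answer can be sketched the same way: request a little above average, limit well above the request. The 20% headroom, the 10m floor, and the 5x limit multiple below are illustrative assumptions, not recommendations from the thread.

```python
import math

def recommend_cpu(avg_millicores: float, request_headroom: float = 1.2,
                  limit_multiple: float = 5.0) -> tuple[int, int]:
    """Sketch of (request, limit) in millicores: request slightly above
    average usage (with a 10m floor), limit a multiple of the request."""
    request = max(10, math.ceil(avg_millicores * request_headroom))
    limit = math.ceil(request * limit_multiple)
    return request, limit

# ~10.1 millicores average, as in the question
print(recommend_cpu(10.1))  # → (13, 65)
```

For the question's service averaging ~10.1m, this would suggest something like a 13m request and a 65m limit, far below the current 1-core limit, while still leaving plenty of room for bursts.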
True, limits can incur costs, so balancing is crucial. Requests reserve capacity on the node for scheduling, so the pod is guaranteed at least that much, right?

What kind of buffer do you usually add on top of that max memory usage?