I'm working through a theoretical setup for a Java application that uses up to 2 cores at startup, which is its peak usage. Once fully started, it drops to around 5% of a core, and a nightly job can spike it to about 15% of a core. The JVM is configured with Xms at 3GB and Xmx at 4GB. The available nodes have 16 cores and 128GB of memory. I'd love to hear how others would set CPU and memory requests and limits for this kind of workload. What settings would you use, and what's your role? I'm a platform engineer, and I'm suggesting a CPU request of 100m with a limit of 3, and a memory request of 3GB with a limit of possibly 5GB. I'd appreciate insights from other roles, like application owners or other kinds of engineers!
6 Answers
If you're on AKS, there’s a feature called JAZ that could help manage your Java app's requests and limits dynamically—worth checking out if you haven’t yet!
If slow startup isn’t an issue, I’d go for a CPU request of 150m. If startup time is critical, I’d set the request at 2 CPU. Memory-wise, I’d request around 4.5-5GB and set the limit to the same value. That leaves room above the 4GB heap for non-heap memory (metaspace, thread stacks, direct buffers), which is what usually gets a pod OOM-killed when the limit sits right at Xmx.
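To make the 4.5-5GB figure concrete, here is a rough sketch of the arithmetic in Python. It assumes the 4GB Xmx is treated as 4096 MiB and that non-heap overhead is somewhere between 12.5% and 25% of the heap. Those overhead fractions are my assumption for illustration, not measured values, and `container_memory_mib` is a made-up helper name.

```python
def container_memory_mib(xmx_mib: int, overhead_fraction: float) -> int:
    """Return heap ceiling plus a non-heap allowance, in MiB."""
    if xmx_mib <= 0 or overhead_fraction < 0:
        raise ValueError("xmx_mib must be positive and overhead_fraction non-negative")
    return round(xmx_mib * (1 + overhead_fraction))


XMX_MIB = 4096  # -Xmx4g from the question, read as 4 GiB

# Assumed overhead range (hypothetical, measure your own app's non-heap usage).
for fraction in (0.125, 0.25):
    total = container_memory_mib(XMX_MIB, fraction)
    print(f"overhead {fraction:.1%}: memory request = limit = {total}Mi")
```

The real overhead depends on thread count, metaspace, and direct buffers, so the safer route is to measure non-heap usage and plug that in instead of a guessed fraction.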
As someone who works with K8s regularly, my concern is about how that startup consumption affects everything else. You might consider your cluster's specific workloads. Are there heavy loads elsewhere that could be impacted if your app spikes? Definitely something to think about before finalizing your settings.
For CPU, I’d suggest a request of 150m without setting a limit. Since your application can reach 15% of a core during the nightly job, a 150m request guarantees it that share even when the node is busy, and with no limit it can still burst into idle CPU at startup. For memory, I’d set both the request and the limit to 4GB so the app gets what it needs during normal operation.
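For anyone wondering where 150m comes from: 15% of a core is 0.15 CPU, and Kubernetes expresses that as 150 millicores. A small Python sketch of the conversion follows. The helper names `to_millicores` and `cpu_resources` are hypothetical, and the output is just the `resources` stanza as JSON, not a full pod spec.

```python
import json
from typing import Optional


def to_millicores(cores: float) -> str:
    """Convert a fraction of a core to a Kubernetes millicore string."""
    return f"{round(cores * 1000)}m"


def cpu_resources(request_cores: float, limit_cores: Optional[float] = None) -> dict:
    """Build the CPU part of a container resources stanza; the limit is optional."""
    resources = {"requests": {"cpu": to_millicores(request_cores)}}
    if limit_cores is not None:
        resources["limits"] = {"cpu": to_millicores(limit_cores)}
    return resources


# Nightly job peak from the question: about 15% of a core, no CPU limit.
print(json.dumps(cpu_resources(0.15), indent=2))
```

Leaving `limit_cores` unset mirrors the "request only" approach in this answer. Passing a second argument gives the request-plus-limit variant the question proposes.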
I personally love using kube-startup-cpu-boost for handling temporary CPU spikes during startup instead of setting a permanent high limit. It really helps maintain normal operations without over-committing resources.
I’d recommend removing the Xms and Xmx settings and using MaxRAMPercentage instead, so the heap is sized as a percentage of the container’s memory limit. I usually start with a memory request that matches the limit and adjust MaxRAMPercentage until the pod stops crashing due to OOM errors. For CPU, I’d start with a request of 100m, but if you’re expecting heavier load from the nightly jobs, bump it to at least 150m.
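As a rough illustration of how MaxRAMPercentage relates to the container limit, here is the arithmetic as a Python sketch. The 5120 MiB limit and the 75% starting value are assumptions picked for the example, the helper names are made up, and the JVM's real heap sizing applies its own alignment, so treat the numbers as approximate.

```python
def max_heap_mib(limit_mib: int, max_ram_percentage: float) -> int:
    """Approximate max heap the JVM would pick for a given container limit."""
    if not 0 < max_ram_percentage <= 100:
        raise ValueError("max_ram_percentage must be in (0, 100]")
    return int(limit_mib * max_ram_percentage / 100)


def percentage_for_heap(limit_mib: int, heap_mib: int) -> float:
    """Percentage needed to reach a target heap size under a given limit."""
    return heap_mib * 100 / limit_mib


LIMIT_MIB = 5120  # assumed container memory limit (5 GiB)

print(max_heap_mib(LIMIT_MIB, 75))           # heap at an assumed 75% setting
print(percentage_for_heap(LIMIT_MIB, 4096))  # percentage matching the old 4 GiB Xmx
```

Whatever is left over after the heap is what the rest of the JVM has to live in, which is why tuning the percentage down is the usual fix when the pod keeps getting OOM-killed.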
Definitely keep an eye on performance metrics as you adjust these settings! Ensuring that you maintain good performance is key. Every setup is different, and sometimes small tweaks can lead to significant impacts.
Had a rough time with a Kubernetes upgrade myself. We saw issues when our Java versions didn’t support cgroups v2: the JVM didn’t pick up the container’s memory limit, which led to lots of OOM errors. So be careful with how Java detects memory inside containers!
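If you want to check which cgroup version a container actually sees, something like the Python sketch below can help. It relies on the usual layout, where cgroup v2 exposes `cgroup.controllers` and `memory.max` under `/sys/fs/cgroup`, and v1 exposes `memory/memory.limit_in_bytes`. The function names are mine, and a non-standard mount layout would need different paths.

```python
from pathlib import Path
from typing import Optional


def cgroup_version(root: str = "/sys/fs/cgroup") -> int:
    """Return 2 if the unified hierarchy marker file exists, otherwise 1."""
    return 2 if (Path(root) / "cgroup.controllers").is_file() else 1


def memory_limit_bytes(root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Read the memory limit visible at the cgroup root, or None if unset or unreadable."""
    base = Path(root)
    if cgroup_version(root) == 2:
        path = base / "memory.max"
    else:
        path = base / "memory" / "memory.limit_in_bytes"
    try:
        raw = path.read_text().strip()
    except OSError:
        return None
    # cgroup v2 reports the literal string "max" when no limit is set.
    return None if raw == "max" else int(raw)


if __name__ == "__main__":
    print("cgroup version:", cgroup_version())
    print("memory limit (bytes):", memory_limit_bytes())
```

Run it inside the pod and compare the result with the max heap the JVM reports. If the JVM's number looks like it was derived from node memory rather than this limit, the Java version is probably not reading cgroup v2.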

We enforce limits in our clusters just to prevent any app from hogging resources due to spiky loads. It’s a bit of a safety net, especially when working with multiple teams.