Hey everyone! I'm facing some issues with CPU and memory settings for my Spring Boot microservices deployed on EKS. We have 6 microservices (Java 17, Spring Boot 3.5.4), most of which are I/O-bound while a few are memory-heavy but none are CPU-bound. We've got the Horizontal Pod Autoscaler (HPA) set up, but we're running into a few roadblocks with the following setup:
- Deployment YAML has requests for CPU set at 750m and memory at 850Mi, with limits at 1250m for CPU and 1150Mi for memory.
- We use the eclipse-temurin:17-jdk-jammy image with the flag -XX:MaxRAMPercentage=50.
- During idle time, memory usage is around 520Mi and under traffic it peaks at approximately 750Mi.
- HPA targets 80% utilization for both CPU and memory, but we're currently seeing only 1% CPU and 83% memory usage, with all 6 pods running and the HPA reporting a ScalingLimited condition.
Here are some of the issues we've observed:
- Java consumes quite a bit of CPU during startup, so we've increased CPU requests to 1250m to lessen the cold start latency.
- Once the app is running, CPU usage drops back to ~1%, but the HPA is still pushing for more pods based on memory usage, leading to resource waste.
- Additionally, the first request is significantly slower (500ms) due to class loading, while subsequent requests are much quicker (80ms).
So here's what I'd like to know:
- What are the best practices for tuning CPU and memory requests and limits for Java services in Kubernetes, especially when CPU usage spikes only during startup?
- Should I consider decoupling HPA from memory scaling and focus solely on CPU or custom metrics?
- Any JVM flag recommendations (like MaxRAMPercentage or tuning GC) for better performance on EKS?
Thanks a bunch for any insights or experiences you could share!
2 Answers
For your CPU settings, I’d suggest keeping the request lower and not setting a limit. That way your service can burst during startup without reserving CPU it never uses afterwards. Remember that the HPA computes utilization as a percentage of requests, not limits. Your 80% memory target is therefore 80% of the 850Mi request, which your pods exceed under traffic. Consider raising the threshold to around 90-95% or bumping your memory request higher. That should help with the scaling issues you’re encountering.
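To make the request-based math concrete, here is a rough Python sketch of the replica formula described in the Kubernetes HPA documentation, using the numbers from the question (850Mi request, 750Mi peak, 6 pods). The function names are my own, and it is a simplification: it ignores min/max replica bounds, stabilization windows and pod readiness handling, and assumes the default tolerance of 0.1.

```python
import math


def utilization_pct(usage_mi: float, request_mi: float) -> float:
    """Utilization as the HPA sees it: usage relative to the REQUEST, not the limit."""
    return 100.0 * usage_mi / request_mi


def desired_replicas(current_replicas: int, current_pct: float,
                     target_pct: float, tolerance: float = 0.1) -> int:
    """desired = ceil(current * currentMetric / targetMetric), skipped within tolerance."""
    ratio = current_pct / target_pct
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target, no scaling
    return math.ceil(current_replicas * ratio)


peak = utilization_pct(750, 850)  # peak usage against the current 850Mi request
print(f"peak utilization: {peak:.1f}%")
print("target 80%:", desired_replicas(6, peak, 80))   # scales up to 7
print("target 95%:", desired_replicas(6, peak, 95))   # stays at 6

# Same 750Mi peak, but with a hypothetical 1000Mi request
print("request 1000Mi, target 80%:",
      desired_replicas(6, utilization_pct(750, 1000), 80))  # stays at 6
```

Either change (a higher target or a higher request) keeps the peak inside the tolerance band, so the HPA stops asking for extra pods on memory alone.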
My approach is often to set requests equal to what my CPU and memory limits would be. If you set a CPU limit too low, Kubernetes will throttle your app, causing brief pauses that can hurt performance. For memory, I size the heap explicitly with the Java options -Xms and -Xmx, and keep memory limits unset so the container is not OOM-killed for crossing a limit.
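For comparison with explicit -Xms/-Xmx values, this small Python sketch estimates the maximum heap the JVM would pick from -XX:MaxRAMPercentage, given the 1150Mi memory limit from the question. The helper names are hypothetical, the 75% case is only an illustrative alternative, and the real JVM aligns the heap size, so treat the results as approximate.

```python
def max_heap_mi(container_limit_mi: float, max_ram_percentage: float) -> float:
    """Approximate max heap: a percentage of the memory the JVM detects."""
    return container_limit_mi * max_ram_percentage / 100.0


def non_heap_headroom_mi(container_limit_mi: float, heap_mi: float) -> float:
    """What is left for metaspace, thread stacks, code cache and direct buffers."""
    return container_limit_mi - heap_mi


LIMIT_MI = 1150  # memory limit from the question's Deployment

for pct in (50, 75):  # 50 is the asker's current flag value
    heap = max_heap_mi(LIMIT_MI, pct)
    headroom = non_heap_headroom_mi(LIMIT_MI, heap)
    print(f"MaxRAMPercentage={pct}: heap ~{heap:.1f}Mi, non-heap headroom ~{headroom:.1f}Mi")
```

Note that the percentage is taken from the container's memory limit. With no memory limit set, the JVM sizes it from the node's memory instead, which is one reason to pin the heap with -Xmx when running without limits.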