I've got a Spring Boot application that's using Redis as a cache in a UAT environment. Lately, I've been facing some memory spikes, which result in the cache running out of memory, forcing me to restart the pod. I don't have an eviction policy in place, but I do have a cleanup job that deletes unnecessary keys periodically. My memory limit is set to 6GB, and I've been monitoring it through Grafana to see when these out-of-memory occurrences happen. I also have access to Splunk logs for my application, but I can't run commands via Redis CLI. I'm looking for suggestions on how I can identify what's causing these OOM issues. My app usage has gone up recently in UAT, and I suspect this might be contributing. Before making changes in production, I need to gather some evidence.
1 Answer
Have you thought about adding logging to your cleanup job? It could record how many keys it deletes on each run and how many keys remain afterwards, so you can correlate the key count with the memory spikes you see in Grafana. If the app is writing faster than the cleanup deletes, or the job isn't matching the keys you expect, those numbers will show it.
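
A minimal sketch of what that logging could look like, assuming Spring Data Redis with a `StringRedisTemplate`, scheduling already enabled via `@EnableScheduling`, and a hypothetical `temp:*` pattern for the keys the cleanup is responsible for (adjust the pattern, schedule, and names to your setup):

```java
import java.util.ArrayList;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.data.redis.core.Cursor;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.ScanOptions;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class CacheCleanupJob {

    private static final Logger log = LoggerFactory.getLogger(CacheCleanupJob.class);

    private final StringRedisTemplate redisTemplate;

    public CacheCleanupJob(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    @Scheduled(fixedDelayString = "${cache.cleanup.delay-ms:600000}") // every 10 minutes by default
    public void cleanUp() {
        List<String> toDelete = new ArrayList<>();

        // SCAN instead of KEYS so the server isn't blocked while we iterate.
        ScanOptions options = ScanOptions.scanOptions().match("temp:*").count(500).build();
        try (Cursor<String> cursor = redisTemplate.scan(options)) {
            cursor.forEachRemaining(toDelete::add);
        }

        Long deleted = toDelete.isEmpty() ? 0L : redisTemplate.delete(toDelete);

        // Total keys left in the database (DBSIZE), to correlate with the Grafana memory graphs.
        Long totalKeys = redisTemplate.execute(
                (RedisCallback<Long>) connection -> connection.serverCommands().dbSize());

        log.info("Cache cleanup: matched={} deleted={} totalKeysAfter={}",
                toDelete.size(), deleted, totalKeys);
    }
}
```

If your existing job deletes keys some other way, the important part is just the counters: keys matched, keys actually deleted, and DBSIZE afterwards. Those log lines will land in Splunk, where you can line them up with the Grafana memory timeline.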

Good idea! You could also enumerate the keys from the application side and check their sizes to see which key patterns are actually eating the memory. Use SCAN rather than KEYS so you don't block the server, and run it at a quiet time since it still touches every key; it's not cheap, but it gets you the evidence you need without CLI access.
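
If you go that route, here is a rough sketch under the same Spring Data Redis assumption. The class and method names are made up for illustration; it uses SCAN so it doesn't block Redis, and for string-typed values it reads STRLEN as a cheap approximation of the value size (hashes, lists, sets, and Redis's own per-key overhead are ignored):

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.data.redis.connection.DataType;
import org.springframework.data.redis.core.Cursor;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.ScanOptions;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class KeySizeReporter {

    private static final Logger log = LoggerFactory.getLogger(KeySizeReporter.class);

    private final StringRedisTemplate redisTemplate;

    public KeySizeReporter(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    /** Scans the keyspace and logs the topN largest string values found. */
    public void logLargestKeys(int topN) {
        // Keyed by value length; ties overwrite each other, which is fine for a rough report.
        TreeMap<Long, String> largest = new TreeMap<>(Comparator.reverseOrder());
        ScanOptions options = ScanOptions.scanOptions().match("*").count(1000).build();

        redisTemplate.execute((RedisCallback<Void>) connection -> {
            try (Cursor<byte[]> cursor = connection.keyCommands().scan(options)) {
                while (cursor.hasNext()) {
                    byte[] key = cursor.next();
                    // STRLEN only applies to string values; other types are skipped here.
                    if (connection.keyCommands().type(key) == DataType.STRING) {
                        Long length = connection.stringCommands().strLen(key);
                        if (length != null) {
                            largest.put(length, new String(key, StandardCharsets.UTF_8));
                            if (largest.size() > topN) {
                                largest.pollLastEntry();
                            }
                        }
                    }
                }
            }
            return null;
        });

        for (Map.Entry<Long, String> entry : largest.entrySet()) {
            log.info("Large cache value: key={} approxBytes={}", entry.getValue(), entry.getKey());
        }
    }
}
```

In UAT you could trigger this from a temporary endpoint or a one-off scheduled run, then search the output in Splunk to see which key prefixes dominate. That gives you concrete numbers to compare against the 6GB limit before you decide on an eviction policy or any production change.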