I'm developing a Java application that loads trades from a database, but I'm facing significant memory issues. The problem arises because accounts have widely varying trade volumes, and during peak trading days, high-volume accounts tend to load simultaneously. This leads to memory exhaustion in the application as it tries to handle too much data at once.
Right now, I'm selecting accounts randomly from a HashSet and loading trades for each account in parallel across 16 threads. But on busy days, this causes my system to run out of memory.
I have to load all trades for an account in one go; switching to a batch approach would require major refactoring. The process is time-sensitive and performance-critical. I do know the trade count for each account in advance, which could help me estimate how much memory each load will need.
I'm looking for strategies to implement a more memory-efficient and effective way to load the trades. Any advice would be greatly appreciated!
1 Answer
Have you considered using a semaphore to control memory usage? Create a semaphore whose permits represent your estimated memory budget (for example, one permit per megabyte, since permit counts are ints). Before a thread starts loading an account, it acquires as many permits as that account is expected to need, based on its trade count, and it releases them once it's done. If the budget isn't available, the thread waits until it is. Just keep in mind that Java’s garbage collection might not make memory available immediately after the permits are released. Also, make sure no thread ever asks for more permits than the semaphore was created with: that request can never be satisfied, so the thread would block forever.
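Here's a minimal sketch of that idea, not a drop-in implementation. The class name `MemoryBoundedLoader`, the `BYTES_PER_TRADE` figure, and the one-permit-per-MiB granularity are all assumptions made up for illustration; you'd replace them with numbers measured from your own trade objects.

```java
import java.util.concurrent.Semaphore;

class MemoryBoundedLoader {
    // Hypothetical figures: measure your own trade objects and tune these.
    static final long BYTES_PER_TRADE = 512;
    static final int PERMIT_BYTES = 1 << 20; // one permit stands for 1 MiB

    private final int totalPermits;
    private final Semaphore budget;

    MemoryBoundedLoader(long budgetBytes) {
        this.totalPermits = (int) Math.max(1, budgetBytes / PERMIT_BYTES);
        // Fair mode: waiting threads are served in FIFO order.
        this.budget = new Semaphore(totalPermits, true);
    }

    // Estimate permits from the known trade count, rounded up and clamped
    // to [1, totalPermits] so that no request can block forever.
    int permitsFor(long tradeCount) {
        long needed = (tradeCount * BYTES_PER_TRADE + PERMIT_BYTES - 1) / PERMIT_BYTES;
        return (int) Math.min(totalPermits, Math.max(1, needed));
    }

    // Blocks until enough budget is free, runs the load, then returns the permits.
    void load(long tradeCount, Runnable loadTrades) throws InterruptedException {
        int permits = permitsFor(tradeCount);
        budget.acquire(permits);
        try {
            loadTrades.run();
        } finally {
            budget.release(permits);
        }
    }

    int availablePermits() {
        return budget.availablePermits();
    }
}
```

Each of your 16 worker threads would call something like `loader.load(tradeCount, () -> loadTradesFor(account))`, where `loadTradesFor` stands in for your existing loading code. The budget could be a fraction of `Runtime.getRuntime().maxMemory()`, though which fraction is safe is something you'd have to find by testing. Note that an account clamped to the full budget still loads, just on its own, so the estimate for the very largest accounts has to fit in the heap. With the fair semaphore, a large account waiting at the head of the queue also holds back smaller ones behind it; that prevents starvation at the cost of some idle threads.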
Thanks for that explanation! Just to clarify: would the memory estimate be based on JVM runtime memory? I feel like that could be overly pessimistic. Also, if you suggest leaving headroom below the heap limit, wouldn't that waste a lot of RAM on high-load days?