I've been experimenting with AgentCore for my company, and I've encountered a potential problem—though it might be user error. When I start an agent session using the 'InvokeAgentRuntime' function, it seems like memory usage remains high even after I call 'StopRuntimeSession'. I set the idle timeout to 60 seconds, but after a minute without new invocations, the RAM usage still doesn't drop back to zero. I'm tracking the memory consumption both through the runtime interface and via CloudWatch for GenAI. To make things more complicated, there's no API available to retrieve active runtime sessions. Has anyone else run into this issue?
4 Answers
How are you deploying your runtime? We're starting to use AgentCore too, and monitoring this is crucial for us. Make sure you've set both the idle timeout and the max lifetime on your runtimes to avoid stuck sessions.
You might consider setting the 'maxLifetime' to a lower value, if that works within your usage model. Also, do you have any logs from your 'InvokeAgentRuntime' attempts? It could help to see if the response is what you expect or if it's hanging until a timeout. Is your agent set up to run asynchronously?
I've found that two settings control session length: idle timeout and max lifetime. Changing them while a session is running has no effect on that session. Mine took eight hours to time out on its own, which isn't ideal.
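For anyone setting these up, here's a rough sketch of how I'd build and sanity-check the two values before passing them to a runtime create or update call. The key name `maxLifetime` comes from this thread; `idleRuntimeSessionTimeout`, the assumption that both values are in seconds, and the rule that idle must not exceed max are my assumptions, so check them against the current API reference. `lifecycle_config` is just a made-up helper name.

```python
def lifecycle_config(idle_seconds, max_lifetime_seconds):
    """Build a lifecycle settings dict (key names are assumptions, see above)."""
    if idle_seconds <= 0 or max_lifetime_seconds <= 0:
        raise ValueError("timeouts must be positive")
    # Assumed rule: a session cannot sit idle longer than it is allowed to live.
    if idle_seconds > max_lifetime_seconds:
        raise ValueError("idle timeout must not exceed max lifetime")
    return {
        "idleRuntimeSessionTimeout": idle_seconds,  # assumed key name
        "maxLifetime": max_lifetime_seconds,
    }


# Example: 60 s idle timeout, 15 min hard cap instead of waiting hours.
config = lifecycle_config(60, 900)
print(config)
```

Keep in mind that, per the above, only sessions started after the change would pick up the new values.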
Sounds like it might be an AWS issue. You could try tearing down any lambda warmers manually if those are keeping your instances alive. Also, check if there's session state being cached in your config. Some people find it helpful to run cleanup scripts on a regular basis. If you're considering alternatives, HydraDB has a different approach to session cleanup, though it might take more effort to migrate. It could also be worth reaching out to support since AgentCore is still fairly new.
What do you mean by 'lambda warmers'? Are they related to AgentCore?
Have you waited a bit longer to see if the memory usage goes down eventually? It could be that the runtime is 'warming up' and just holding onto the resources for a bit while it waits for more requests.
If that's the case, it's a bit concerning since it isn't documented. It could definitely add up and cost you more than expected!

Yes, it's tricky. If a session gets stuck and you don't have its ID, there's no way to kill it except waiting. Deleting the runtime doesn’t stop the sessions either.
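
Since there's no way to list active sessions, one workaround is to record every session ID on the client side before you invoke, so you always have it when you need to stop the session. A minimal sketch, assuming a boto3-style client that exposes `invoke_agent_runtime` and `stop_runtime_session` taking `agentRuntimeArn` and `runtimeSessionId` keyword arguments (please verify against the current SDK docs). `SessionRegistry` and the `sessions.json` file are hypothetical names for illustration.

```python
import json
import uuid
from pathlib import Path


class SessionRegistry:
    """Client-side record of the session IDs we started (hypothetical helper)."""

    def __init__(self, client, runtime_arn, path="sessions.json"):
        self.client = client
        self.runtime_arn = runtime_arn
        self.path = Path(path)
        # Reload IDs saved by earlier runs so they can still be stopped.
        if self.path.exists():
            self.sessions = set(json.loads(self.path.read_text()))
        else:
            self.sessions = set()

    def _save(self):
        self.path.write_text(json.dumps(sorted(self.sessions)))

    def invoke(self, payload, session_id=None):
        session_id = session_id or str(uuid.uuid4())
        # Persist the ID before the call so a crash mid-invoke doesn't lose it.
        self.sessions.add(session_id)
        self._save()
        response = self.client.invoke_agent_runtime(  # assumed signature
            agentRuntimeArn=self.runtime_arn,
            runtimeSessionId=session_id,
            payload=payload,
        )
        return session_id, response

    def stop_all(self):
        stopped = []
        for session_id in sorted(self.sessions):
            self.client.stop_runtime_session(  # assumed signature
                agentRuntimeArn=self.runtime_arn,
                runtimeSessionId=session_id,
            )
            stopped.append(session_id)
        self.sessions.difference_update(stopped)
        self._save()
        return stopped
```

You'd construct it with your real client and runtime ARN, call `invoke(...)` instead of calling the client directly, and run `stop_all()` from a cleanup job. It doesn't help with sessions whose IDs are already lost, but it should keep new ones from getting orphaned.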