I'm running an ECS Fargate Spot task and noticed something odd. According to the documentation, my container is supposed to receive a SIGTERM signal before being interrupted, allowing up to 120 seconds for cleanup based on the stopTimeout I set. However, my task was terminated after just 21 seconds, which makes me wonder if the stopTimeout is being bypassed during spot interruptions or if there's a bug. My task logs show that at 18:08:30, my app logged "Received SIGTERM," but by 18:08:51, it was killed with SIGKILL (exitCode: 137). Has anyone else experienced this, or does anyone know what the correct behavior is supposed to be?
2 Answers
Yeah, my app should be set up to handle SIGTERM properly, but clearly it didn’t work as expected here. I'll review the code to make sure it's reacting appropriately. Thanks for the insight on the timing, though!
From what I understand, a custom stopTimeout might not be respected during Spot interruptions. AWS tends to take back its resources immediately when it needs them, prioritizing that over graceful shutdown. Also, make sure your application handles the SIGTERM signal properly; if it doesn’t finish shutting down within the allowed window, it will be force-killed. Read through the documentation again for more details!
Exactly, the SIGTERM handling is crucial. If your app hasn’t exited by the time the stop timeout runs out, it gets a SIGKILL (that’s the exitCode 137), and anything it hadn’t flushed or saved by then can be lost.
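For anyone checking their own handler, here is a minimal Python sketch of the pattern being discussed: catch SIGTERM, stop taking new work, and run cleanup before exiting. It is only an illustration and not the original poster's code; `run_worker`, `process_one_item`, and `cleanup` are made-up names, and it assumes a Linux container where the app process actually receives the signal.

```python
import os
import signal
import threading

# Set when SIGTERM arrives; the work loop checks it between units of work.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request here; the real cleanup happens in the main loop."""
    print("Received SIGTERM", flush=True)
    shutdown_requested.set()


def run_worker(process_one_item, cleanup, poll_interval=0.1):
    """Call process_one_item() repeatedly until SIGTERM is seen, then call cleanup() once.

    process_one_item and cleanup are hypothetical callables supplied by the app.
    Must be called from the main thread, because signal.signal() requires it.
    """
    signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        while not shutdown_requested.is_set():
            process_one_item()
            # wait() returns as soon as the event is set, so shutdown is not delayed
            shutdown_requested.wait(poll_interval)
    finally:
        cleanup()
```

The important part is that each unit of work stays short compared to the stop timeout. If a single `process_one_item()` call can run longer than the window, the loop never gets a chance to notice the signal before SIGKILL arrives.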
Actually, the documentation states that Fargate Spot tasks should give you a two-minute warning and time for graceful shutdown, so it sounds like something isn’t working as intended in your case.
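One thing worth ruling out first: stopTimeout is set per container in the task definition, and on Fargate a container without it falls back to the 30-second default. Below is a rough Python sketch that takes a task definition as a dict (the same shape as the `taskDefinition` object that `describe-task-definition` returns) and lists the stop timeout each container would get. The function name and the example container names are made up for illustration.

```python
# Fargate falls back to 30 seconds when a container has no stopTimeout.
FARGATE_DEFAULT_STOP_TIMEOUT = 30


def effective_stop_timeouts(task_definition):
    """Return {container name: stopTimeout in seconds} for a task definition dict."""
    return {
        container["name"]: container.get("stopTimeout", FARGATE_DEFAULT_STOP_TIMEOUT)
        for container in task_definition.get("containerDefinitions", [])
    }


# Hypothetical task definition: only the "app" container sets stopTimeout.
example = {
    "containerDefinitions": [
        {"name": "app", "stopTimeout": 120},
        {"name": "log-router"},
    ]
}

print(effective_stop_timeouts(example))  # {'app': 120, 'log-router': 30}
```

If the container that got the SIGKILL shows 120 here, the task definition is fine and the 21-second kill needs another explanation.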