I'm using Azure Container Apps with an HTTP autoscaling rule targeting 10 concurrent requests per replica to generate reports. However, replicas get terminated during scale-in, causing reports to fail mid-execution. Is this approach suitable for long-running tasks on Azure Container Apps? And are there any key considerations regarding Service Bus lock timeouts that I should be aware of?
2 Answers
Be sure to test that your Docker image starts and stops gracefully. If it handles shutdown signals correctly, that makes a big difference for long-running processes: you can finish in-flight work instead of losing it during scaling events.
It's best to separate your API from the processing tasks. Instead of having the API execute the job directly, have it drop a message onto a Service Bus queue and return a job ID immediately. Then use worker containers to pull messages from the queue, and scale those workers on queue depth with KEDA instead of on HTTP concurrency. That way, workers aren't terminated while they're still processing jobs.
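A sketch of that split, using a stdlib `queue.Queue` as a stand-in for the Service Bus queue (in production the API would send via an `azure-servicebus` `ServiceBusClient`, and the worker would receive from the same named queue; the function and field names here are hypothetical):

```python
import queue
import uuid

# Stand-in for the Service Bus queue shared by API and workers.
job_queue = queue.Queue()

def submit_report(params):
    """API side: enqueue the job and return immediately with a job ID."""
    job_id = str(uuid.uuid4())
    job_queue.put({"job_id": job_id, "params": params})
    return job_id  # the API can respond 202 Accepted with this ID

def worker_drain(process):
    """Worker side: pull and process messages until the queue is empty."""
    done = []
    while True:
        try:
            msg = job_queue.get_nowait()
        except queue.Empty:
            return done
        process(msg)  # generate the report; only then settle the message
        done.append(msg["job_id"])
```

The client polls a status endpoint (or gets a callback) with the job ID, so no HTTP request ever has to stay open for the duration of a report.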

Absolutely, that's the right approach! Mixing HTTP-triggered scaling with long-running tasks causes exactly this problem: the autoscaler doesn't know a report is still being generated when it decides to terminate a replica. Queue-based workers with KEDA scaling on queue depth are definitely the way to go.
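On the lock-timeout part of the question: a Service Bus message lock (30 seconds by default, configurable up to 5 minutes) can expire while a long report is still running, at which point the broker redelivers the message to another worker. The `azure-servicebus` SDK ships an `AutoLockRenewer` that keeps renewing the lock in the background while you process. A toy, stdlib-only illustration of that renewal pattern (the class and its parameters are invented for illustration, not the SDK's API):

```python
import threading
import time

class LockRenewer:
    """Toy version of lock renewal: while a job runs, a background thread
    keeps pushing the lock expiry forward so the broker doesn't redeliver."""

    def __init__(self, lock_duration, interval=None):
        self.lock_duration = lock_duration
        self.interval = interval if interval is not None else lock_duration / 2
        self.expiry = time.monotonic() + lock_duration
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._renew, daemon=True)
        self._thread.start()

    def _renew(self):
        # Renew well before expiry; a real renewer calls the broker here.
        while not self._stop.wait(self.interval):
            self.expiry = time.monotonic() + self.lock_duration

    def lock_valid(self):
        return time.monotonic() < self.expiry

    def close(self):
        self._stop.set()
        self._thread.join()
```

Without renewal, you'd also want the job itself to be idempotent, since a redelivered message means the report may be generated twice.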