I'm trying to get to the bottom of some performance issues with my Azure Functions. I have two function apps, both triggered by HTTP requests. Function App 1 receives a request from my Web API and sends around 42 requests to Function App 2. Function App 2 scales to about 40 instances, performs some quick calculations, and responds within 10 milliseconds. However, once the volume spikes to between 1,000 and 15,000 requests, response times increase significantly, with calculations taking longer and longer as if they were just sitting in a pending state. Instead of all 15,000 invocations running concurrently and completing quickly, the batch can sometimes take up to 10 minutes! I'm wondering if this is due to SNAT port limitations, general concurrency issues, or something else. When I run the same number of requests with simplified calculations, the problem largely disappears, which suggests that Function App 2 isn't scaling well enough to handle that many requests at once. I'd love to hear any thoughts or insights on this!
4 Answers
What tech stack are your functions built on? If you're using Python, note that by default a Python host instance may process only one invocation at a time, which would cap your throughput per instance. If you're using something else, that could also be relevant, so a bit more context would help!
How are you invoking Function App 2? Are you using HTTP requests again, or maybe queues or blob triggers? Those methods can have their own limitations that might influence performance under load.
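If it does turn out to be plain HTTP calls, one thing worth trying on the caller side is capping how many requests are in flight at once, so Function App 1 doesn't open every outbound connection at the same moment. Here's a rough sketch using only asyncio from the standard library. Note that `call_function_app_2` is a made-up stand-in for your real HTTP call and `max_in_flight=10` is an arbitrary number I picked, so treat this as an illustration rather than a drop-in fix.

```python
import asyncio


async def call_function_app_2(payload: int) -> int:
    # Hypothetical stand-in for the real HTTP request to Function App 2.
    # Replace the sleep with your actual async HTTP client call.
    await asyncio.sleep(0.001)
    return payload * 2


async def fan_out(payloads, max_in_flight: int = 10):
    # The semaphore allows at most `max_in_flight` calls to run at once.
    semaphore = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0  # highest number of simultaneous calls observed

    async def guarded(payload):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await call_function_app_2(payload)
            finally:
                in_flight -= 1

    # gather() returns results in the same order as the inputs.
    results = await asyncio.gather(*(guarded(p) for p in payloads))
    return results, peak


# 42 outbound calls, as in the question, with at most 10 open at a time.
results, peak = asyncio.run(fan_out(range(42), max_in_flight=10))
print(len(results), "responses, peak in-flight calls:", peak)
```

Whether this helps depends on where the bottleneck really is. It limits pressure on outbound connections, but it won't make Function App 2 itself any faster.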
Thanks for the input! I'm also using HTTP triggers for both function apps. Do you think switching to a queue might help?
You might want to check the scaling settings of your function app. There's some great documentation on Azure Functions event-driven scaling you could look into. It could give insights on whether it's behaving as expected or if there are limits you're hitting.
Here are some important points to consider: 1) Azure Functions won't give you one instance per request, and by default a Python host instance handles only a limited number of invocations at a time. 2) You won't get anywhere close to 15,000 instances because of the plan limits (a maximum of 100 instances on the Linux Consumption plan and 1,000 on Flex Consumption). 3) Even when scaling is configured, it doesn't happen instantaneously, so a burst of requests can finish before the app reaches its optimal instance count. You might need to explore other options if you're aiming for all requests to complete in under 10 seconds.
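To put rough numbers on that, here's a back-of-envelope estimate of the best case. It assumes every calculation takes a steady 10 ms (the figure from the question), that each instance handles one invocation at a time (my assumption, adjust `per_instance_concurrency` if yours differs), and it ignores cold starts, network time, and scale-out lag entirely. The function name and parameters are just mine for illustration.

```python
import math


def estimated_drain_seconds(requests: int, service_ms: int, instances: int,
                            per_instance_concurrency: int = 1) -> float:
    # Total invocations that can run at the same moment across all instances.
    slots = instances * per_instance_concurrency
    # Requests are worked off in "waves" of `slots` invocations each.
    waves = math.ceil(requests / slots)
    return waves * service_ms / 1000


# Best-case time to work through 15,000 requests at 10 ms each.
for instances in (10, 30, 100):
    seconds = estimated_drain_seconds(15_000, 10, instances)
    print(f"{instances} instances: {seconds} s")
# 10 instances: 15.0 s
# 30 instances: 5.0 s
# 100 instances: 1.5 s
```

So even in the ideal case, 10 instances can't clear 15,000 requests in under 10 seconds. On the other hand, the ideal numbers are nowhere near 10 minutes, so it's worth measuring how long each calculation actually takes under load rather than assuming it stays at 10 ms.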
Thanks for clarifying! We're on a dedicated app service plan that allows 10-30 instances—we're currently at 10, and moving to 30 didn't seem to help. Are you suggesting that reaching my goal may not be feasible with function apps? Should I consider moving to an app service or using VMs for handling requests more effectively?

Thanks for your reply! Both function apps are indeed built using Python and are triggered by HTTP requests.