Hey everyone! I'm currently investigating a .NET web application that's crucial for our operations. We're experiencing some annoying intermittent issues, where the app seems to pause or slow down for about 10 to 45 seconds during peak usage. This is causing problems for multiple applications that rely on it. Users are reporting delays and unresponsiveness when they try to fetch data.
During these slowdowns, I've noticed that the app's CPU time and available memory drop to zero, alongside a significant decrease in connections—from about 6,000 down to just 2,000.
One confusing aspect is that when we look into the detailed traces of the delayed requests, other operations complete quickly, but there's often a 10-second gap where it seems like the app is doing nothing.
We did manage to fix some async-over-sync coding issues, but unfortunately, the problem still persists. Any insights or ideas would be much appreciated!
Also, just found out that there's a function app sharing the same service plan which spikes in execution count (20m at its peak!) right around the times we experience these slowdowns. That's a lot! Thanks in advance for any tips!
4 Answers
From my experience, issues like this often stem from how outbound HTTP calls are managed. If HttpClient instances and their connections aren't reused or disposed of properly, it can lead to bottlenecks. I'd recommend taking a memory dump and analyzing it on a dev machine; you can capture one from the diagnostics section. It could help you pinpoint what's causing the problems.
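To make the "reuse" point concrete, here's a minimal sketch of sharing one client for the whole process. It assumes you're on modern .NET (Core 2.1 or later, where SocketsHttpHandler exists); the SharedHttp class name, the 2-minute connection lifetime, and the 30-second timeout are just placeholders I picked, not anything from your app.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical wrapper: the class and member names are illustrative only.
public static class SharedHttp
{
    // One handler for the lifetime of the process, so connections are
    // pooled and reused instead of being opened for every request.
    private static readonly SocketsHttpHandler Handler = new SocketsHttpHandler
    {
        // Assumed value: recycle pooled connections periodically so DNS
        // changes are eventually picked up.
        PooledConnectionLifetime = TimeSpan.FromMinutes(2)
    };

    // One client shared by all callers; it is never disposed per request.
    public static readonly HttpClient Client = new HttpClient(Handler)
    {
        // Assumed value: fail a stuck call instead of waiting indefinitely.
        Timeout = TimeSpan.FromSeconds(30)
    };

    // Convenience method that always goes through the shared client.
    public static Task<string> GetStringAsync(string url) => Client.GetStringAsync(url);
}
```

If the dump shows lots of HttpClient or handler instances, code that does `new HttpClient()` inside a request path is the first thing I'd look for. In ASP.NET Core, IHttpClientFactory is the other common way to get the same pooling behavior.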
Welcome to App Service! It's a pretty tricky environment, and it feels like a black box at times without much observability. Good luck!
Haha, I’m definitely learning that the hard way over here!
You should use Application Insights to analyze network and resource performance. This could help you identify where the bottlenecks are, and then you can focus your efforts on resolving those specific issues.
I’ve been using Application Insights a lot, especially its profiling features. Just worried I might be missing something critical or chasing after the wrong things. Thanks for your thoughts!
One thing you might want to try is increasing the minimum thread pool size, especially if you're queuing a large number of tasks. Also, make sure you're using server garbage collection mode, which matters when handling lots of requests. And don't forget to check whether you're creating new instances of HttpClient unnecessarily; that can exhaust sockets and really slow things down. Oh, and it's worth checking for SNAT port exhaustion as well!
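For the thread pool and GC part, here's a rough sketch of what I mean, assuming you can run a bit of code at startup. The RuntimeTuning class and its method names are made up for illustration, and the right minimum values depend entirely on your workload, so treat any number you pass in as something to measure rather than a recommendation.

```csharp
using System;
using System.Runtime;
using System.Threading;

// Hypothetical startup helper: the names are illustrative only.
public static class RuntimeTuning
{
    // Raises the thread pool minimums without ever lowering the current ones.
    // Returns false if the runtime rejects the values (for example, when they
    // exceed the configured maximums).
    public static bool RaiseMinThreads(int minWorker, int minIo)
    {
        ThreadPool.GetMinThreads(out int currentWorker, out int currentIo);
        return ThreadPool.SetMinThreads(
            Math.Max(currentWorker, minWorker),
            Math.Max(currentIo, minIo));
    }

    // Reports the effective settings so they can be logged at startup.
    public static string Describe()
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        return $"ServerGC={GCSettings.IsServerGC}, MinWorker={worker}, MinIO={io}";
    }
}
```

Calling something like `RuntimeTuning.RaiseMinThreads(100, 100)` early in startup and logging `RuntimeTuning.Describe()` would at least tell you whether server GC is actually on. If it isn't, setting `<ServerGarbageCollector>true</ServerGarbageCollector>` in the project file is the usual way to enable it.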
Yeah, I was suspecting that too! We process many calls that aren’t CPU-heavy but come in high volumes, so it makes sense. Thanks for the HttpClient tip, I’ll definitely investigate that further. I’ve also set up a NAT gateway because we were running out of ports.
I’ve been tracing issues and I think I'm getting closer to the root cause. The app has been flagged for high handle and thread counts, so I suspect there's something off with those HTTP calls as well. Gonna check the dump once I can sort it out!