Strange CPU Spikes on My App Service – Need Help!

Asked By TechWhiz042 On

I've been struggling with a peculiar issue on my production App Service, which runs 8 instances. I'm on the second-to-last SKU tier, which usually provides more than enough compute and memory. However, I'm seeing unexpected CPU spikes to 100% that last 30 to 60 seconds on various instances. They occur sporadically throughout the hour, but never on more than one instance at a time.

What baffles me is that the issue first showed up on Tuesday, even though traffic levels have been consistent for weeks and there have been no deployments in that time because everything seemed very stable. Our App Service functions as an API that connects to around 10 different external partners via HttpClient, and I can't help but wonder whether that might be the root cause. Although I have Application Insights enabled, I'm still having trouble pinpointing what triggers these spikes. I've also examined some memory dumps and CPU stacks but haven't found any clues. I'm confident there's no excessive API traffic from third parties, so I could use some advice on how to proceed. Thanks a lot!

5 Answers

Answered By CodeMaster88 On

It would be a good idea to check whether you're using IHttpClientFactory properly. If not, that could be one source of the issue. Also, one of your downstream APIs may be acting up. I recommend adding some logging around your HttpClient calls to get more visibility.
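As a rough illustration of the "logging around your calls" idea, here is a minimal sketch of a timing wrapper that records which partner was called, how long the call took, and whether it failed. The app in question is .NET, where the same idea would more naturally live in a DelegatingHandler registered through IHttpClientFactory; Python is used here only to keep the sketch short. The names `timed_call`, `partner`, `operation`, and the 1000 ms `slow_ms` threshold are all assumptions made for the example, not anything from the original setup.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; use whatever your application already logs to.
logger = logging.getLogger("outbound")


@contextmanager
def timed_call(partner, operation, slow_ms=1000.0):
    """Log duration and outcome of one outbound call to a partner API.

    `partner` and `operation` are free-form labels chosen by the caller.
    Calls that fail or take at least `slow_ms` are logged as warnings.
    """
    start = time.perf_counter()
    outcome = "ok"
    try:
        yield
    except Exception as exc:
        # Record the exception type, then let the error propagate unchanged.
        outcome = f"error:{type(exc).__name__}"
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        slow_or_failed = elapsed_ms >= slow_ms or outcome != "ok"
        level = logging.WARNING if slow_or_failed else logging.INFO
        logger.log(
            level,
            "partner=%s op=%s outcome=%s elapsed_ms=%.1f",
            partner, operation, outcome, elapsed_ms,
        )
```

Wrapping each outbound request in `with timed_call("partner-a", "get-quote"):` (placeholder labels) would make it possible to line up slow or failing partner calls against the timestamps of the CPU spikes on a given instance.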

DevGuru99 -

Exactly! Don't forget to check how you're managing the HttpClient lifetime, too: creating and disposing a new client for every request can exhaust sockets under load. Without logging, it's tough to pin down the problem. If you have end-to-end transaction details available in App Insights, those could give you valuable information as well. You should definitely try to narrow down whether the issue is local to your app or happens during the outbound API calls.

Answered By DevOpsMan32 On

Double-check for SNAT port exhaustion in the App Service diagnostics. Your issue sounds very familiar to me, and this was a critical point of failure in my experience. If you haven’t already, you could set up auto-heal rules or health checks on your App Service, which could help you ride out the situation for now.
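For context on why connection handling matters here: every new outbound connection needs its own source port, so opening a fresh connection per request uses up ports much faster than reusing one. The sketch below is only a local analogy, not a reproduction of Azure SNAT behaviour. It starts a throwaway loopback server that echoes back the client's source port, then compares a new connection per call against one shared connection. All names in it (`PortEchoHandler`, `compare_port_usage`, and so on) are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    """Replies with the source port the client connected from."""

    protocol_version = "HTTP/1.1"  # allow keep-alive so connections can be reused

    def do_GET(self):
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def ports_with_new_connection_per_call(host, port, calls):
    """Open and close a connection for every request; collect source ports."""
    seen = set()
    for _ in range(calls):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", "/")
            seen.add(int(conn.getresponse().read()))
        finally:
            conn.close()
    return seen


def ports_with_shared_connection(host, port, calls):
    """Send every request over one kept-alive connection; collect source ports."""
    seen = set()
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        for _ in range(calls):
            conn.request("GET", "/")
            seen.add(int(conn.getresponse().read()))
    finally:
        conn.close()
    return seen


def compare_port_usage(calls=5):
    """Return (ports used with fresh connections, ports used with one shared)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        fresh = ports_with_new_connection_per_call(host, port, calls)
        shared = ports_with_shared_connection(host, port, calls)
        return fresh, shared
    finally:
        server.shutdown()
        server.server_close()
```

Calling `compare_port_usage()` should show the shared connection using a single source port while the per-call variant uses several. In a .NET app, the equivalent of the shared connection is a long-lived or factory-managed HttpClient rather than one created per request.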

Answered By ServerNinja748 On

While it might not directly relate, we encountered a similar problem due to an Active Directory server that handled DNS duties for several Windows servers. The DNS load was overwhelming the server, and upgrading its SKU solved our problems. Just a thought!

Answered By DataDude55 On

Consider setting up OpenTelemetry on your application. App Insights without it is pretty limited, and OpenTelemetry can provide a much better view of what's happening. It can really help with tracing issues, especially regarding SQL performance.

CodeWhisperer12 -

What are the additional benefits you find with OpenTelemetry? I find App Insights valuable, particularly the tracing feature for SQL performance, but I'm curious about how OpenTelemetry enhances that further.

Answered By CloudExpert21 On

You might want to check what Azure dependencies you're using and how you're authenticating them. Sometimes issues with integration can build up if the authentication gets overwhelmed.
