In light of the recent Azure outage, which led Microsoft to recommend using Traffic Manager for rerouting, I'm looking for strategies to handle failover when Azure Front Door (AFD) goes down—especially since AFD hosts critical URLs for production applications. What are some tried-and-true solutions for ensuring failover during outages?
7 Answers
We put Imperva in front, with Azure Front Door as its main origin and Cloudflare as the secondary. DNS points to Imperva, which normally routes to AFD and switches to Cloudflare only during outages. However, if Azure's backend itself goes down, we're pretty much in trouble and need to boost our DR budget!
Doesn't routing through Imperva to Azure Front Door seem a bit convoluted? Why not route Imperva or Front Door directly to Azure services and just flip the DNS to another service during outages?
Would it have made sense to bypass Front Door and go straight from Traffic Manager to an App Gateway today? I don’t know much about Front Door, but I do know an App Gateway in East US was handling WAF with no issues during the downtime.
We saw Azure's CDN for the Python SDK stop working during the outage. If the backend infrastructure is down, why isn't there an automatic failover? This seems problematic.
Can't you just install the necessary packages from PyPI? Also, the source code is on GitHub, so that’s an option.
I’m a bit puzzled about the Traffic Manager recommendation. What if it goes down too? Isn’t it just another point of failure? I get that Traffic Manager is DNS-based rather than a layer 7 proxy, but if part of DNS fails, wouldn't that affect resolving the origin? I know there’s an explanation, but I can’t seem to find it.
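For what it's worth, Traffic Manager works at the DNS level: it answers a DNS query with one of your endpoints based on the routing method and health probes, and clients then connect to that endpoint directly. Here is a rough sketch of the idea behind priority routing. It is an illustration only, not how Traffic Manager is actually implemented; the `Endpoint` class, the `resolve` function, and the hostnames are made up for the example, and returning `None` when nothing is healthy is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    hostname: str   # hypothetical hostname, for illustration only
    priority: int   # lower number = preferred endpoint
    healthy: bool   # result of the most recent health probe


def resolve(endpoints: List[Endpoint]) -> Optional[str]:
    """Return the hostname of the healthy endpoint with the lowest priority number.

    Returns None when no endpoint is healthy (a simplification for this sketch).
    """
    candidates = [e for e in endpoints if e.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.priority).hostname


# Primary (Front Door) is failing its probe, so the answer falls back to the gateway.
endpoints = [
    Endpoint("afd.example.com", priority=1, healthy=False),
    Endpoint("appgw-eastus.example.com", priority=2, healthy=True),
]
print(resolve(endpoints))
```

Because the switch happens in the DNS answer, clients only move over once their cached record expires, so the record TTL limits how fast failover takes effect.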
We rely on a lot of custom routing rules in AFD, which Traffic Manager doesn’t support, making it tough for us to switch over. Our choices appear limited: perhaps moving these rules into the app itself, or setting up a backup CDN like Cloudflare or Akamai, could work. You could then run Traffic Manager in front of AFD and the backup, or change DNS during an outage.
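If the rules are mostly path-based, moving them into the app could look roughly like this. It's a minimal sketch assuming simple prefix matching where the longest prefix wins; the `RULES` table, `pick_backend`, and every URL in it are hypothetical placeholders, not anything taken from AFD.

```python
from typing import List, Tuple

# Hypothetical rule table: (path prefix, backend base URL).
RULES: List[Tuple[str, str]] = [
    ("/api/v2/", "https://api-v2.internal.example.com"),
    ("/api/", "https://api.internal.example.com"),
    ("/static/", "https://static.internal.example.com"),
]
DEFAULT_BACKEND = "https://web.internal.example.com"


def pick_backend(path: str) -> str:
    """Pick the backend whose prefix matches the path; the longest prefix wins."""
    matches = [rule for rule in RULES if path.startswith(rule[0])]
    if not matches:
        return DEFAULT_BACKEND
    # Prefer the most specific (longest) matching prefix.
    return max(matches, key=lambda rule: len(rule[0]))[1]


print(pick_backend("/api/v2/users"))
print(pick_backend("/checkout"))
```

With the rules in the app, whatever sits in front (Traffic Manager, a backup CDN, or plain DNS) only needs to reach a single origin, which makes switching providers during an outage simpler.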
After recent Front Door outages, we're seriously considering switching entirely to Cloudflare. We've been debating it, but now it feels necessary. Keeping a failover CDN is costly; I'm curious if anyone has cheaper alternatives.
We're also exploring options here. It's crucial to understand how DNS, custom domains, and SSL impact our failover strategies. You might find useful insights in this documentation about global routing redundancy for web applications: [Global routing redundancy for mission-critical web applications - Azure Architecture Center | Microsoft Learn](https://learn.microsoft.com/en-us/azure/architecture/guide/networking/global-web-applications/overview?tabs=cli).

Which specific service are you using in Imperva? I couldn’t find the right match on their website.