I'm updating our organization's disaster recovery plan. We rely heavily on Azure services: SQL Managed Instance, Web App Services, Blob Storage, several VMs, and Key Vault. We have good strategies in place for most outage scenarios, but I've hit a potential worst-case scenario: what if we completely lose access to our Azure account, either due to a security incident or an error on Microsoft's side?
We've taken some precautions, like backing up our Blob Storage data externally to S3 and being able to spin up VMs in a different Azure subscription or with another cloud provider. We can also run our Web App Services applications on an alternative subscription or provider.
The real challenges are Key Vault and SQL Managed Instance. Our SQL Managed Instance is encrypted with a key stored in Key Vault, and while we take daily COPY_ONLY backups to Blob Storage, those backups are encrypted with that same Key Vault key. According to the documentation, if we lose access to our subscription, we can't restore the backups elsewhere because we would no longer have the key. I'm curious to hear how others are addressing this risk and what their disaster planning strategies look like.
3 Answers
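It's crucial to have a backup strategy that doesn't rely solely on Azure. Keeping copies of your Key Vault keys and secrets outside the vault adds another layer of protection, but note that, as far as I know, keys generated inside Key Vault can't be exported in plaintext, and a Key Vault backup can only be restored into a vault in the same subscription and geography. For the TDE key in particular, that suggests generating it outside Azure, importing it (BYOK), and keeping the original somewhere safe offline. Run regular drills to confirm your disaster recovery procedures still work. Engaging with Microsoft support proactively can also uncover solutions or best practices tailored to your setup.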
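One way to make "doesn't rely solely on Azure" testable during a drill is a small dependency audit: list each recovery asset, where its copy lives, and what else it needs in order to be restored, then ask what becomes unrecoverable if one provider disappears. The sketch below is only an illustration in plain Python. The `Asset` class, the `unrecoverable_without` function, and the inventory entries are all hypothetical names loosely modelled on the setup described in the question, not anything from an Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    location: str        # where the recovery copy lives, e.g. "azure" or "s3"
    needs: tuple = ()    # names of other assets required to restore this one


def unrecoverable_without(provider, assets):
    """Return the sorted names of assets that cannot be restored if `provider` is lost."""
    known = {a.name for a in assets}
    # Anything stored only with the lost provider is gone immediately.
    lost = {a.name for a in assets if a.location == provider}
    changed = True
    while changed:
        changed = False
        for a in assets:
            if a.name in lost:
                continue
            # An asset is also lost if any dependency is lost or not in the inventory.
            if any(dep in lost or dep not in known for dep in a.needs):
                lost.add(a.name)
                changed = True
    return sorted(lost)


# Hypothetical inventory: the SQL backup copy sits in S3, but it is encrypted
# with a key that exists only in Key Vault.
INVENTORY = [
    Asset("blob_data", "s3"),
    Asset("sql_backup", "s3", needs=("tde_key",)),
    Asset("tde_key", "azure"),
]

if __name__ == "__main__":
    print(unrecoverable_without("azure", INVENTORY))
```

With this inventory the audit flags both the key and the SQL backup, even though the backup file itself is stored outside Azure. Adding an offline copy of the key as its own asset, and pointing the backup at that copy, is what would clear the finding.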
I'm actually going through a similar exercise right now. Earlier this year, we did a practice run which went well. We're also in talks with Microsoft to get clarity on Key Vault access in the event of a subscription loss, and I'll share any useful insights we gain from that conversation!
One effective solution is to set up a so-called "break glass" account that is completely independent of any external identity provider, such as on-prem AD. This account is only used when everything else fails, so you still have access in a crisis. I've added some recommended security settings and sign-in alerts for it. You can check out more details in this Microsoft guide: [link].
We've already set up a break glass account following Microsoft's recommendations. It's provisioned on the onmicrosoft.com domain and uses a FIDO key, and we have alerts set up for any sign-in. But that only covers identity: I'm still worried about losing access to the subscription itself if things go south.