I'm drafting a blueprint for DevSecOps in Azure and want to get some insights from those who have real-world experience deploying it in production. Here are some specific points I'd like to discuss:
- In Azure DevOps pipelines, what criteria do you use for blocking versus warning, and what are your reasons?
- How do you manage approvals and environment checks to ensure the system is reliable even under pressure during incidents?
- Do you use Azure Policy and Defender as build-time gates, runtime detections, or both?
- What's your established approach for service connections, agents, and accessing Key Vault?
- How do you maintain audit trails to ensure controls have been executed and that approvals are traceable?
I'm also interested in hearing about common pitfalls you've encountered, such as issues with multi-subscription management, untracked console hotfixes, or overdue exceptions without enforceable expirations.
3 Answers
In production, we block anything that could compromise security or system integrity: failed SAST scans, severe container vulnerabilities, or missing infrastructure policies. Lesser issues, such as style concerns or moderate vulnerabilities, only produce warnings unless they accumulate significantly. If you implement too many hard blocks, people will look for workarounds.

For approvals, we combine environment checks with RBAC. It's critical not to rely solely on manual checks during incidents, so we lean on enforced policies and scoped service connections instead.

We treat Azure Policy and Defender as both: early warnings at build time, and runtime detection for alerting and monitoring.

A major pitfall I've experienced is hotfixes made in the console during outages that never get synchronized back to the IaC. That leads to silent drift over time, with no one realizing production has diverged from what's in the main branch.
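To make the block-versus-warn split above concrete, here is a minimal sketch of how such a gate decision could be expressed. It is not our actual pipeline code: the `ScanResult` fields, the `evaluate_gate` function, and the `MODERATE_BLOCK_THRESHOLD` value are all hypothetical names and numbers chosen for illustration, and you would feed it from whatever your scanners actually report.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical threshold: the point at which accumulated moderate findings
# stop being a warning and become a block. Pick a value that fits your team.
MODERATE_BLOCK_THRESHOLD = 20


@dataclass
class ScanResult:
    """Hypothetical summary of scanner output for one pipeline run."""
    sast_passed: bool
    severe_container_vulns: int
    missing_required_policies: int
    moderate_vulns: int
    style_issues: int


def evaluate_gate(result: ScanResult) -> tuple[str, list[str]]:
    """Return ("block" | "warn" | "pass", reasons) for a pipeline run."""
    blocking: list[str] = []
    if not result.sast_passed:
        blocking.append("SAST failed")
    if result.severe_container_vulns > 0:
        blocking.append(f"{result.severe_container_vulns} severe container vulnerabilities")
    if result.missing_required_policies > 0:
        blocking.append(f"{result.missing_required_policies} required policies missing")
    # Moderate findings only block once they have piled up.
    if result.moderate_vulns >= MODERATE_BLOCK_THRESHOLD:
        blocking.append(f"{result.moderate_vulns} moderate vulnerabilities accumulated")
    if blocking:
        return "block", blocking

    warnings: list[str] = []
    if result.moderate_vulns > 0:
        warnings.append(f"{result.moderate_vulns} moderate vulnerabilities")
    if result.style_issues > 0:
        warnings.append(f"{result.style_issues} style issues")
    return ("warn", warnings) if warnings else ("pass", [])


if __name__ == "__main__":
    decision, reasons = evaluate_gate(
        ScanResult(sast_passed=True, severe_container_vulns=0,
                   missing_required_policies=0, moderate_vulns=3, style_issues=1)
    )
    print(decision, reasons)
```

Keeping the rules in one small function means the list of hard blocks stays short and reviewable, which helps with the workaround problem mentioned above.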
I just don't use Azure DevOps at all.
What platform are you using instead of Azure DevOps, and are you still deploying applications to Azure?
In my organization, we mainly rely on pull request approvals as the primary gate. We use Infrastructure as Code (IaC) modules that come with sensible defaults for deploying standard resources. We enforce both Microsoft Defender for Cloud (MDC) and CIS benchmark settings through Azure Policy. For the critical configurations we really care about, we write our own policies and initiatives.

For service connections, we use GitHub Actions with a read-only identity available to all branches, while the write/apply identity is restricted to the main branch. Each identity has only the RBAC permissions it needs for its tasks, such as accessing Key Vault. As for agents, it varies: we use either GitHub-hosted agents or agents on private networks, depending on the need. Keep things simple, and avoid key- or token-based methods when possible.
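The branch rule for the two identities can be sketched roughly like this. This only illustrates the rule itself; in a real setup the restriction should be enforced in the identity's trust configuration and RBAC assignments, not in script logic. The identity names and the `select_identity` helper are hypothetical.

```python
# Hypothetical identity names; in practice these would be two separate
# workload identities with different RBAC role assignments.
READ_ONLY_IDENTITY = "iac-plan-readonly"
APPLY_IDENTITY = "iac-apply-write"

# The only ref allowed to use the write/apply identity.
PROTECTED_REF = "refs/heads/main"


def select_identity(git_ref: str, action: str) -> str:
    """Pick the identity a workflow run may use for the given action."""
    if action == "plan":
        # Read-only identity is available to every branch.
        return READ_ONLY_IDENTITY
    if action == "apply":
        if git_ref != PROTECTED_REF:
            raise PermissionError(
                f"apply is only allowed from {PROTECTED_REF}, got {git_ref}"
            )
        return APPLY_IDENTITY
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    print(select_identity("refs/heads/feature/x", "plan"))
    print(select_identity("refs/heads/main", "apply"))
```

A feature branch can still run a plan, but any attempt to apply from it fails closed instead of silently falling back to the read-only identity.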
Thanks, this is really helpful! Just to clarify on the GitHub Actions you mentioned: do you control access to the read-only and apply identities using only RBAC and branch rules, or are there additional safety nets like environment protections and mandatory reviews for applying changes?

Appreciate the insights! Quick question: how do you manage exceptions so they don’t become permanent waivers? Also, how do you reconcile console hotfixes back into IaC once an outage has passed?