I've been using GitHub for managing releases, which gives us the essentials like branch protection, mandatory reviews, and CI checks, but these don't catch every issue. I'm curious what advanced rules or guardrails the community has implemented beyond GitHub's defaults. For instance, do you set limits on the size or complexity of pull requests? Do you require PRs to be tied to specific tickets or project goals? Have you automated any checks on review quality? I'd also love to hear about company-wide rules that have significantly improved your release process. If you have practical examples where these extra governance measures prevented incidents, please share, especially for situations where GitHub's built-in protections fall short.
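To make the size-limit question concrete, here's a minimal sketch of what that could look like as a plain GitHub Actions workflow; the 400-line budget, file name, and job names are all illustrative, not a standard:

```yaml
# .github/workflows/pr-size-budget.yml -- illustrative names and threshold
name: pr-size-budget
on: [pull_request]

jobs:
  size-check:
    runs-on: ubuntu-latest
    steps:
      - name: Fail PRs over the line budget
        env:
          # additions/deletions come straight from the pull_request event payload
          ADDITIONS: ${{ github.event.pull_request.additions }}
          DELETIONS: ${{ github.event.pull_request.deletions }}
        run: |
          total=$((ADDITIONS + DELETIONS))
          echo "PR touches $total changed lines"
          if [ "$total" -gt 400 ]; then
            echo "Over the 400-line budget; split the PR or get an explicit exception." >&2
            exit 1
          fi
```

For it to actually block anything, the job would need to be listed as a required status check under branch protection.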
3 Answers
I mainly use GitLab with its security scanning features turned on, so every pipeline gets scanned. It's a paid tier, but it recently caught several issues, like the compromised npm debug package, and blocked those deployments.
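For anyone wanting to try the GitLab route, the scanners are enabled by including the stock CI templates; a minimal sketch (note that some of these scanners, and blocking merges on their findings, need a paid tier):

```yaml
# .gitlab-ci.yml -- pulls in GitLab's built-in security scanning templates
include:
  - template: Security/Dependency-Scanning.gitlab-ci.yml   # flags known-bad dependencies (e.g. compromised npm releases)
  - template: Security/SAST.gitlab-ci.yml                   # static analysis of first-party code
  - template: Security/Secret-Detection.gitlab-ci.yml       # committed credentials
```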
For me, monitoring key metrics with Argo Rollouts is the main thing. My job isn't to micromanage release quality; it's to make sure the release goes out smoothly and that we can roll it back quickly if something goes wrong.
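In case it helps, here's roughly what that metric gate looks like as an Argo Rollouts AnalysisTemplate; the Prometheus address, job label, and 95% threshold are placeholders, not a recommendation:

```yaml
# AnalysisTemplate sketch: abort the rollout if the success rate drops below 95%
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      successCondition: result[0] >= 0.95
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090
          query: |
            sum(rate(http_requests_total{job="my-service",code!~"5.."}[5m]))
            /
            sum(rate(http_requests_total{job="my-service"}[5m]))
```

Referenced from the canary steps of a Rollout, a failing analysis run aborts the release and Argo rolls back to the stable version on its own.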
Is that really enough for services that serve end users? I wonder if that might leave some gaps.
We’ve put some effective guardrails in place that have stopped issues before they escalate:
- **SLO-gated canaries**: With Argo/Flagger, releases pause or roll back automatically when error rates cross the thresholds we've defined.
- **Risk labels and size budgets**: We’ve set thresholds for PRs, so any that exceed 400 lines of code require a rollback plan and a demo before merging.
- **Enforced migration safety**: We only allow safe, incremental migration steps (expand/contract); destructive changes like column drops can't ship all at once.
- **Contract tests**: Using Pact helps us catch breaking changes at service boundaries, like mismatched headers.
- **Policy-as-code**: With tools like OPA and Conftest, we enforce rules against things like wildcard IAM roles and require proper tagging on Terraform plans (sketched at the end of this answer).
- **Post-deploy verifications**: We perform synthetic checks and monitor key performance indicators before we’re done with the deployment.
These measures have been lightweight yet effective; they've stopped a migration that would have dropped a production table and caught silent API breaks.
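Since people often ask how the policy-as-code gate fits into CI, here's a minimal sketch in GitHub Actions terms. The workflow name, the `policy/` path, and the assumption that `terraform` and `conftest` are available on the runner are all ours; the Rego rules under `policy/` are where the wildcard-IAM and tagging checks actually live:

```yaml
# Sketch: render the Terraform plan as JSON, then evaluate it against Rego policies
name: terraform-policy-gate
on: [pull_request]

jobs:
  plan-policy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Render the plan as JSON
        run: |
          terraform init -input=false
          terraform plan -input=false -out=tfplan.binary
          terraform show -json tfplan.binary > tfplan.json
      - name: Evaluate the plan with Conftest
        # assumes conftest has been installed on the runner
        run: conftest test tfplan.json --policy policy/
```

The nice property is that the plan is rejected before apply, so nothing risky ever reaches the cloud account.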
Great list! We also use Warestack, which enforces similar rules, like requiring additional reviews for PRs over 400 LOC. It checks that PRs align with PM objectives and blocks reviews outside of working hours. It allows exceptions for urgent fixes, so we don't face unnecessary blockers. We're also looking into more proactive guardrails that catch issues before code reaches production.
I'm on GitHub, but I’ll definitely borrow some ideas from GitLab! Are there specific guidelines you've implemented that ensure safety without slowing the team down?