I'm working in a CI/CD-driven environment using Docker, Kubernetes, and Azure, but I'm finding that database changes are less predictable than application deployments. We use migrations, yet DB changes still feel riskier, especially since multiple teams push changes at the same time. I'd love to hear your practical experience on a few things: Do you isolate database changes into separate pipelines? How do you realistically handle rollbacks? Is schema diff validation required before merging? And how do you discourage the culture of applying quick fixes directly in production? I'm more interested in what has worked for real teams long-term than in theoretical approaches. What strategies are you using to minimize the release risk of database changes?
1 Answer
When managing database schema changes, focus on zero-downtime migrations: every change should be backward compatible with the application version that is currently running. For example, to rename a column, follow these steps: create the new column, write to both columns, backfill existing data, switch reads to the new column, stop writing to the old column, and finally drop the old column. Yes, it can feel like a lot of small deployments, but each step is safe on its own and leaves both old and new application code working. You can also check out this project for guidelines on zero-downtime migrations. It is a Rails gem, but most of the guidance applies regardless of tech stack: https://github.com/ankane/strong_migrations.

It's tempting to do everything in one go, but breaking the change into smaller deployments makes zero downtime much easier to achieve.
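
To make the column-rename steps above more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the `users` table, the old column `name`, the new column `full_name`, and the helper function names are all hypothetical, and SQLite stands in for whatever database you actually run. In a real system each function would ship as its own migration or application deploy rather than run back to back.

```python
import sqlite3


def add_new_column(conn):
    # Step 1: add the new column next to the old one. It is nullable,
    # so application code that does not know about it keeps working.
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")
    conn.commit()


def write_user(conn, user_id, value, columns):
    # Steps 2 and 5: the application decides which columns it writes.
    # `columns` comes from trusted constants here, never from user input.
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    conn.execute(
        f"INSERT INTO users (id, {cols}) VALUES (?, {marks})",
        (user_id, *([value] * len(columns))),
    )
    conn.commit()


def backfill(conn, batch_size=2):
    # Step 3: copy old values into the new column in small batches so
    # no single statement touches the whole table at once.
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET full_name = name WHERE id IN ("
            "  SELECT id FROM users"
            "  WHERE full_name IS NULL AND name IS NOT NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


def read_full_name(conn, user_id):
    # Step 4: reads switch to the new column once the backfill is done.
    row = conn.execute(
        "SELECT full_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def drop_old_column(conn):
    # Step 6: remove the old column once nothing reads or writes it.
    # SQLite only supports DROP COLUMN from version 3.35.0 onward.
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        conn.execute("ALTER TABLE users DROP COLUMN name")
        conn.commit()
        return True
    return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    for user_id, name in [(1, "Ada"), (2, "Grace"), (3, "Linus")]:
        write_user(conn, user_id, name, ("name",))         # old code path
    add_new_column(conn)                                   # step 1
    write_user(conn, 4, "Edsger", ("name", "full_name"))   # step 2: dual write
    copied = backfill(conn)                                # step 3
    names = [read_full_name(conn, i) for i in range(1, 5)] # step 4
    write_user(conn, 5, "Barbara", ("full_name",))         # step 5
    drop_old_column(conn)                                  # step 6
    conn.close()
    return copied, names


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that a rollback at any step only has to undo that one small step: until the final drop, the old column still holds valid data, so the previous application version keeps working.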