Hi everyone! I'm looking for some hands-on advice about building a high availability strategy for a payment platform that requires 100% uptime. My tech stack includes Linux (Ubuntu Server), Apache2 with SSL, a Node.js backend, a PostgreSQL database, a React.js frontend, and 8 systemd services. My domain is on Cloudflare with SSL/TLS set to Full (strict).
I'm considering a full multi-server failover setup with Cloudflare Load Balancer, but I'm not sure how to keep my servers in sync. I've also thought about daily backups via cron, but those don't help if the server goes down completely.
Here are my key questions:
1. How do I sync the primary and backup servers if I use Cloudflare Load Balancer?
2. Do I need to manually replicate changes on the backup server when I adjust the primary one?
3. Can I use tools like Ansible to deploy changes to both servers at the same time?
4. I'm especially worried about keeping the PostgreSQL database and SSL certificates synchronized. The React and Node parts seem easier to manage.
Thanks for any practical advice you can share!
3 Answers
You're actually looking for a high-availability (HA) setup rather than just backups. To achieve that, consider a PostgreSQL cluster along with multiple app server instances behind a load balancer. Redundancy has to exist at both the hardware and the software level, so that no single machine and no single service instance can take the platform down.
What you're pursuing falls under high availability, not backup strategies. It's crucial to explore the HA options available with your tech stack to design a robust system.
Getting to 100% uptime is practically impossible! You need to plan for redundancy from the start, meaning your architecture should be a distributed system with no single point of failure.
I get that now. So what's the first practical step to shift to this model? Should I separate the database into a managed cluster first?
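To put rough numbers on why redundancy matters, here's a small Python sketch of the arithmetic. The 99.9% per-server figure is just an assumed example, and the formula assumes servers fail independently and failover is instant, which is optimistic in practice.

```python
# Rough availability arithmetic. The 99.9% per-server figure used below
# is an assumed example, not a measurement of any real setup.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year for an availability given as a fraction (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(per_node: float, nodes: int) -> float:
    """Availability of N redundant nodes where any one node can serve traffic.

    Assumes independent failures and instant failover (both optimistic).
    """
    return 1.0 - (1.0 - per_node) ** nodes


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = combined_availability(0.999, n)
        print(f"{n} node(s): {a:.9f} -> {downtime_minutes_per_year(a):.3f} min/year")
```

Each extra node shrinks the expected downtime a lot, but the result never reaches exactly 100%, and correlated failures (same datacenter, same bad deploy) make the real figure worse than this estimate.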

Exactly! For PostgreSQL HA, look into options like Patroni or repmgr to manage replication and automatic failover in your cluster. And yes, Cloudflare Load Balancer can then handle failover for the app servers by health-checking each origin and routing traffic away from any that fail.
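If you go the Patroni route, here's a minimal Python sketch of how a script or health check could work out which node is currently the primary. It relies on my understanding that Patroni's REST API answers GET /primary with HTTP 200 only on the current leader, and that the API listens on port 8008 by default; check both against your Patroni version. The hostnames in the usage comment are made up.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses still carry a status code (e.g. 503 on a replica).
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def find_primary(
    nodes: Iterable[str],
    probe: Callable[[str], Optional[int]] = http_status,
    port: int = 8008,  # assumed Patroni REST API port; adjust to your config
) -> Optional[str]:
    """Return the first node whose /primary check answers 200, else None."""
    for node in nodes:
        if probe(f"http://{node}:{port}/primary") == 200:
            return node
    return None


# Usage (hypothetical hostnames):
#   find_primary(["db1.example.internal", "db2.example.internal"])
```

The probe is passed in as a parameter so you can test the selection logic without a live cluster. In production you'd more likely point HAProxy or a similar proxy at the same endpoint rather than hand-roll this.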