I'm new to a startup and now responsible for managing our Kubernetes infrastructure, which runs a multi-tenant, multi-region SaaS platform. We have a simple architecture where each customer has one web app pod and three database pods (two replicas and one primary). However, the current setup seems inefficient since the database replicas are not utilized for read operations, and we're facing challenges with visibility and resource allocation. I'm looking to address the "Postgres problem" with some proposed changes: 1) Modify the app to write to the primary and read from replicas; 2) Use specific hosts for Postgres with taints/tolerations; 3) Redesign the database schema for multi-tenancy. Some are saying these changes are too costly or complex. I'm seeking feedback on whether my suggestions are valid and realistically achievable given our budget constraints. Am I missing anything obvious?
3 Answers
Your suggestions are solid, but they may not get traction until there's a push to invest more in reliability. One key issue seems to be redundant data stored across customer databases, which wastes space. On the Kubernetes side, keep in mind that the scheduler places pods based on their declared resource requests, not on live usage, so pods without sensible requests and limits can let power users monopolize a node while other nodes sit idle. Setting requests and limits on the pods, or otherwise spreading load onto the idle nodes, would make better use of the resources you already have.
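To make the scheduling point concrete, here is a rough Python sketch of the kind of fit check the scheduler performs. It is a simplification, not the real kube-scheduler code: the `Pod` and `Node` classes and their field names are hypothetical, and only CPU and memory requests are considered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    name: str
    cpu_request_m: int    # requested CPU in millicores
    mem_request_mi: int   # requested memory in MiB


@dataclass
class Node:
    name: str
    cpu_allocatable_m: int
    mem_allocatable_mi: int
    pods: List[Pod] = field(default_factory=list)

    def fits(self, pod: Pod) -> bool:
        """Compare summed *requests* (not live usage) against allocatable."""
        cpu_requested = sum(p.cpu_request_m for p in self.pods)
        mem_requested = sum(p.mem_request_mi for p in self.pods)
        return (
            cpu_requested + pod.cpu_request_m <= self.cpu_allocatable_m
            and mem_requested + pod.mem_request_mi <= self.mem_allocatable_mi
        )


node = Node("node-1", cpu_allocatable_m=2000, mem_allocatable_mi=4096)
node.pods.append(Pod("pg-tenant-a", cpu_request_m=1500, mem_request_mi=3072))

print(node.fits(Pod("pg-tenant-b", 500, 1024)))   # exactly fills the node
print(node.fits(Pod("pg-tenant-c", 600, 512)))    # CPU requests would exceed allocatable
print(node.fits(Pod("pg-no-requests", 0, 0)))     # always "fits", however busy the node is
```

The last case is the one to watch for: a pod that declares no requests can be placed on an already busy node, which is how a heavy tenant ends up crowding out its neighbours.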
Definitely worth exploring how Kubernetes allocates resources. Smart scheduling could alleviate some pressure on your databases.
You're on the right track with your instincts! However, you might be overscoping things. It's not as complicated as it sounds, and you don't need all three of your proposals to make a significant impact. Pointing the application's read queries at the read-only service for the replicas is a small config change and shouldn't be seen as unreasonable. Also, consider setting `podAntiAffinityType: required` in the cluster's affinity settings so the database instances are spread across different nodes. For now, you don't even need new nodes; you can relabel some of your existing ones to dedicate them to Postgres. Additionally, the CNPG PgBouncer Pooler could really help with connection management without major changes, and cluster hibernation is worth a look for lighter tenants to save resources. Your proposal for a multi-tenant schema would be great long-term, but tackling the other fixes first fits better within the current budget limitations.
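A minimal sketch of the read/write split in Python, assuming the databases are managed by CNPG, which exposes a `<cluster>-rw` service for the primary and a `<cluster>-ro` service for the replicas. The `TenantDatabase` class, the cluster and namespace names, and the naive `is_read_only` check are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantDatabase:
    cluster: str            # CNPG Cluster name (hypothetical)
    namespace: str          # tenant namespace (hypothetical)
    database: str = "app"
    port: int = 5432

    def host(self, read_only: bool) -> str:
        # CNPG services: "<cluster>-rw" -> primary, "<cluster>-ro" -> replicas only.
        suffix = "ro" if read_only else "rw"
        return f"{self.cluster}-{suffix}.{self.namespace}.svc"

    def dsn(self, read_only: bool = False) -> str:
        return f"postgresql://{self.host(read_only)}:{self.port}/{self.database}"


def is_read_only(sql: str) -> bool:
    """Naive classifier: anything that is not an obvious read goes to the primary."""
    return sql.lstrip().lower().startswith(("select", "show"))


def dsn_for(db: TenantDatabase, sql: str) -> str:
    """Pick the replica service for reads and the primary service for everything else."""
    return db.dsn(read_only=is_read_only(sql))


db = TenantDatabase(cluster="tenant-a-db", namespace="tenant-a")
print(dsn_for(db, "SELECT * FROM invoices"))
print(dsn_for(db, "UPDATE invoices SET paid = true"))
```

Two caveats apply to a sketch like this: replicas can lag slightly behind the primary, so reads that must see a write the same request just made should still go to the `-rw` service, and a statement such as `SELECT ... FOR UPDATE` takes row locks and belongs on the primary even though it starts with `SELECT`.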
That’s good advice! Also, it’s crucial to assess your current database load. Alerts for slow application responses can help identify if you're hitting resource limits.
Thanks for the input! It sounds like we need to focus on prioritizing changes that can be implemented quickly without additional costs.
Don’t forget to clearly define the "Postgres problem" you're facing. Without a specific definition, it’s hard to give useful guidance. Whether it’s mainly data duplication or mainly performance will point you to different solutions. Either way, using CNPG seems like a good bet for streamlining your management.
Thanks! The post is mainly about inefficient resource allocation and avoiding data duplication across customer setups.
Identifying the exact problem is key to finding the right solution. You’re doing great!

That’s an insightful way to look at it! We do have a lot of overlapping data, so reducing redundancy would have a big impact.