I'm working on my final year project, an application that shards SQL databases, specifically PostgreSQL. My goal is to optimize point queries with some filtering. I want to avoid range queries and keyless aggregations so that queries don't turn into scatter-gather across shards. I'm currently implementing hash-based routing, but I'm stuck on how to choose the right sharding key for query routing without knowing the schema in advance. I thought about maintaining a registry that maps each key to its corresponding table, but many tables share the same unique fields, so the same sharding key could end up registered for several tables. I'm looking for an algorithm or method to identify suitable fields across different tables so that related tables can be grouped together. Any suggestions?
1 Answer
One approach is to look for the table that's at the center of most relationships, i.e. the one that most other tables reference through foreign keys, and use its key as your sharding key. Tables that reference it can be sharded on their foreign key to it, so related rows land on the same shard and most point queries stay local. The remaining tables either need to be replicated to the shards or kept in a standalone, unsharded database.
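To make this concrete, here is a minimal sketch in Python, not a definitive implementation. It assumes you have already extracted the foreign-key edges from the schema (in PostgreSQL these are available from the system catalogs). The table names, the `id` primary key, and the helper names `pick_hub`, `plan_shard_keys` and `shard_for` are all hypothetical and only for illustration. It picks the most-referenced table as the hub, assigns each direct child its foreign key to the hub as the shard key, and marks everything else as replicated or standalone.

```python
import hashlib

# Hypothetical foreign-key edges: (child_table, child_column, parent_table).
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers"),
    ("invoices", "customer_id", "customers"),
    ("tickets", "customer_id", "customers"),
    ("orders", "currency_code", "currencies"),
]
ALL_TABLES = {"customers", "orders", "invoices", "tickets", "currencies"}


def pick_hub(foreign_keys):
    """Return the table referenced by the largest number of distinct tables."""
    children = {}
    for child, _column, parent in foreign_keys:
        children.setdefault(parent, set()).add(child)
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(children), key=lambda table: len(children[table]))


def plan_shard_keys(foreign_keys, all_tables, hub, hub_key):
    """Map each co-locatable table to its shard key column.

    Only direct children of the hub are handled here; tables that reach
    the hub through several hops would need a graph walk instead.
    """
    plan = {hub: hub_key}
    for child, column, parent in foreign_keys:
        if parent == hub:
            plan.setdefault(child, column)
    # Anything without a key into the hub is replicated or kept standalone.
    replicated = sorted(all_tables - set(plan))
    return plan, replicated


def shard_for(key_value, shard_count):
    """Hash-based routing: a stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


hub = pick_hub(FOREIGN_KEYS)
# Assumption: the hub's primary key column is called "id".
plan, replicated = plan_shard_keys(FOREIGN_KEYS, ALL_TABLES, hub, "id")
print(hub, plan, replicated)
print(shard_for(42, 4))
```

With this plan, a point query on `orders` filtered by `customer_id` routes to the same shard as the matching `customers` row, so it never fans out. Note that the registry is keyed by table rather than by field name, which sidesteps the problem of the same field appearing in several tables.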
