I'm working on a system that requires both filter capabilities and full-text search across a large dataset. Currently, I'm using MySQL for transactions and Manticore Search for the full-text aspect. My main concern is finding the fastest database solution for searching through millions of records with up to six filters: two filters based on the `IN()` clause (with around five values), a price range filter, and two static string and date filters. I considered using a composite index in MySQL with proper order and cursor pagination for fetching results in batches of 30, but I worry about performance issues due to the `IN()` clause causing numerous index lookups per query. I'm also exploring the possibility of using Manticore's columnar attributes for filtering, but I'm unsure how it will perform with large datasets. Any suggestions or experiences related to this would be greatly appreciated!
3 Answers
When it comes to querying large datasets with multiple filters, the way you structure your schema and queries is key. Ultimately, it might be worth experimenting with different relational databases, like PostgreSQL or MariaDB, as they often have excellent optimization for read queries. Benchmarking different setups on your specific workload can yield surprising results, even if the platform's raw speed is similar.
You're right to be cautious about MySQL's performance given your criteria. Using B-trees for your complex query needs can present challenges. For efficient multi-filter searches, a columnar database like ClickHouse might be a good fit, especially for aggregations. Manticore could also deliver solid performance depending on how its search capabilities complement your schema.
When dealing with SQL databases, ensure your indexes are well-structured for filtering. The process generally involves building indexes that can effectively identify potentially matching rows and leveraging query plans that optimize the selection process. Both MySQL and PostgreSQL can be efficient if your indexes are set up properly, but PostgreSQL offers a more advanced query planner which might be beneficial for complex scenarios like yours.
I totally agree! PostgreSQL's advanced capabilities can really shine if you manage to align your indexes and queries. It's worth giving it a shot alongside Manticore to compare performance.

Exactly! Combining columnar databases with traditional OLTP systems might give you the best of both worlds. Don’t hesitate to play around with Manticore and see how it handles your data—it might surprise you!