Hey folks! I'm facing a major challenge: our UI pulls data directly from Elasticsearch, and the cluster is costing us about $110,000 per month! We have around 200TB of AWS storage allocated, of which 130TB is already in use.
We've realized that we've been indexing way too many fields, most of which we don't actually need. So, to cut costs, we're planning to index only the essential fields for UI filtering, which we estimate will reduce our data size by about 90%.
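To put a number on that estimate, here is a quick back-of-the-envelope sketch. The function name is made up for illustration, and the 90% figure is our own estimate rather than a measurement. It also says nothing about the bill itself, since cost probably won't drop in exact proportion to storage.

```python
def projected_index_size_tb(used_tb: float, reduction: float) -> float:
    """Estimate the index size left after dropping unneeded fields.

    `reduction` is the fraction of data expected to disappear (0.90 = 90%).
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    # Round to avoid floating-point noise in the estimate.
    return round(used_tb * (1.0 - reduction), 2)


# Figures from this post: 130TB in use, ~90% estimated reduction.
remaining_tb = projected_index_size_tb(130, 0.90)
share_of_allocation = remaining_tb / 200  # fraction of the 200TB we have allocated
```

That works out to roughly 13TB left in Elasticsearch, well under the 200TB currently allocated.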
The new approach is to keep complete JSON documents in S3. The plan is as follows:
- When a user applies filters, we fetch the necessary data from Elasticsearch.
- When they want to see the full dataset, we retrieve it from S3.
Currently, we handle around 700,000 calls to Elasticsearch each month. I'm curious, does this approach sound reasonable? Any insights would be really helpful!
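For reference, here's a minimal sketch of the two-step flow I have in mind. It's only a sketch under assumptions: the index name, the filter fields, the bucket, and the `docs/<id>.json` key layout are all hypothetical, and the clients are passed in so it works with whatever Elasticsearch/OpenSearch client and boto3 S3 client you already have. I've used the `body=` style of `search`, which newer Elasticsearch clients may flag as deprecated.

```python
import json
from typing import Any, Dict, List


def search_ids(es: Any, index: str, filters: Dict[str, Any], size: int = 50) -> List[str]:
    """Step 1: ask Elasticsearch only for the IDs of matching documents.

    "_source": False means no document bodies are returned, so the index
    only has to hold the fields the UI filters on.
    """
    body = {
        "_source": False,
        "size": size,
        "query": {
            "bool": {
                # Exact-match filters; each key is a (hypothetical) indexed field.
                "filter": [{"term": {field: value}} for field, value in filters.items()]
            }
        },
    }
    response = es.search(index=index, body=body)
    return [hit["_id"] for hit in response["hits"]["hits"]]


def fetch_full_document(s3: Any, bucket: str, doc_id: str, prefix: str = "docs/") -> Dict[str, Any]:
    """Step 2: load the complete JSON document for one ID from S3.

    Assumes each document was stored under '<prefix><doc_id>.json'.
    """
    obj = s3.get_object(Bucket=bucket, Key=f"{prefix}{doc_id}.json")
    return json.loads(obj["Body"].read())
```

In practice `es` would be the search client and `s3` would be `boto3.client("s3")`; the UI would call `search_ids` when filters change and `fetch_full_document` only when a user opens a full record.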
3 Answers
It sounds like a solid plan to limit your indexed fields to the essentials, especially considering your high costs! You might want to think about your cluster size and how much you can shrink it once the field reduction is in place. Using S3 is a budget-friendly move, but just remember it won't match Elasticsearch's performance for queries. Is there any specific part of your data set that's more important for searches, or do you need to search through everything every time?
If you're on AWS’s managed OpenSearch, check out the remote store feature. It allows primary data to remain on disk while keeping a copy on S3. It could be a great hybrid approach for you! Here's a detailed link: [AWS OpenSearch Remote Store Feature](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/or1.html).
With your setup, if you're mainly doing basic exact-match filtering in the UI, you might want to consider whether you really need Elasticsearch's full-text search. If your read throughput is low, a regular database might be cheaper, since you'd mostly be paying for storage. Is your write throughput also on the lower side?
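For context, 700,000 calls a month averages out to roughly 0.27 requests per second over a 30-day month, which is low. As a rough illustration of the database idea, here's a small sketch using SQLite from the Python standard library as a stand-in for whatever database you'd actually pick. The table layout and the `status` / `region` filter columns are made up for the example; the point is just that exact-match filters map onto an ordinary indexed lookup that returns S3 keys.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_filter_index(rows: Iterable[Tuple[str, str, str, str]]) -> sqlite3.Connection:
    """Create an in-memory table holding only the filterable fields plus the S3 key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (doc_id TEXT PRIMARY KEY, status TEXT, region TEXT, s3_key TEXT)"
    )
    # A composite index covers the exact-match filters the UI applies.
    conn.execute("CREATE INDEX idx_status_region ON docs (status, region)")
    conn.executemany("INSERT INTO docs VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def find_s3_keys(conn: sqlite3.Connection, status: str, region: str) -> List[str]:
    """Return the S3 keys of documents matching both filters exactly."""
    cursor = conn.execute(
        "SELECT s3_key FROM docs WHERE status = ? AND region = ? ORDER BY doc_id",
        (status, region),
    )
    return [row[0] for row in cursor]


conn = build_filter_index([
    ("a1", "open", "eu", "docs/a1.json"),
    ("b2", "closed", "eu", "docs/b2.json"),
    ("c3", "open", "eu", "docs/c3.json"),
    ("d4", "open", "us", "docs/d4.json"),
])
matching_keys = find_s3_keys(conn, "open", "eu")
```

Whether this beats a slimmed-down Elasticsearch cluster depends on how many distinct filter combinations the UI needs and on your write volume, so treat it as something to benchmark rather than a recommendation.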
