I'm working on a hobby project involving over 2 million documents, each with metadata like author, title, description, keywords, and publication year, all stored in a 3GB JSON file. I want to implement a similarity-search feature over this data. I've been prototyping with FAISS locally and am considering deploying something similar on AWS. I've looked into OpenSearch, but the pricing seems high even for the serverless option. I also thought about loading my embedding model in a Lambda function and reading the index from S3, but I'm worried about cost and latency for users. Does anyone have suggestions for a good balance between affordability and practicality for this kind of project?
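For context, the core operation being discussed is nearest-neighbor search over embedding vectors. Here is a minimal, dependency-free sketch of brute-force cosine similarity, which is essentially what a flat FAISS index (e.g. `IndexFlatIP` over normalized vectors) computes; all names and the toy 3-dimensional vectors are hypothetical, and real embeddings would come from your model:

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def top_k(query, docs, k=2):
    """Return (doc_id, score) pairs, highest cosine similarity first."""
    q = normalize(query)
    scored = []
    for doc_id, vec in docs.items():
        d = normalize(vec)
        score = sum(a * b for a, b in zip(q, d))  # dot product of unit vectors
        scored.append((doc_id, score))
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings" standing in for real model output.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], docs))
```

A flat index like this scans every vector per query; FAISS and pgvector add approximate index structures (IVF, HNSW) so that 2M vectors don't require a full scan.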
3 Answers
Just wanted to pop in and say thanks! I hadn't considered RDS with vectors before, mostly relying on Dynamo for everything. I set up a test database and will keep a close eye on my expenses while testing it out.
Using Postgres RDS with the pgvector extension could be a great solution. It's flexible, and you can stop or scale down the instance when you're not using it, which helps with costs. Plus, you could explore Amazon Bedrock (e.g., a Knowledge Base with an embedding model) for populating your vector table. Good luck!
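To make the pgvector suggestion concrete, here is a sketch of the SQL involved, held in Python strings for illustration. The table and column names are hypothetical, and the embedding dimension (384 here, matching common sentence-transformer models) depends on your model; you would execute these against RDS Postgres via a driver such as psycopg:

```python
# Setup: enable the extension, create a table with a vector column, and
# add an approximate-nearest-neighbor index (lists is a tuning knob).
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    title text,
    author text,
    pub_year int,
    embedding vector(384)
);

CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
"""

# Query: <=> is pgvector's cosine-distance operator; %s would be bound
# to the query embedding by the driver.
QUERY_SQL = """
SELECT id, title, embedding <=> %s AS distance
FROM documents
ORDER BY embedding <=> %s
LIMIT 10;
"""
```

One nice property of this approach is that ordinary `WHERE` clauses on the metadata columns (author, publication year, etc.) compose with the vector ordering in a single query, which is awkward to replicate with a bare FAISS index.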
You might want to check out Postgres RDS with the pgvector extension. It's effective for your needs and won't break the bank. Another alternative is an in-memory database, but keep in mind that it tends to run around double the cost in exchange for much quicker queries. Honestly, I wouldn't recommend OpenSearch if you're looking for something hobby-friendly.