I'm working on a Retrieval-Augmented Generation (RAG) app that uses external embedding and large language model (LLM) APIs. My code has grown too complex for AWS Lambda, so I've containerized it and plan to run it on Fargate. The vector database logic lives inside the container as well. I'm looking for the most cost-effective way to store the embeddings without using RDS or DynamoDB. EFS crossed my mind, but I'm curious whether there's a faster option. Also, can EFS actually hold the embedding documents my container produces, or is it strictly a file system?
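To make the EFS part of the question concrete: EFS is an NFS file system, and a Fargate task can mount it as a volume, so from inside the container it looks like an ordinary directory that can hold any file, embeddings included. Below is a minimal sketch of that idea using only the Python standard library. The mount path `/mnt/efs`, the file name, the tiny dimension `DIM`, and the helper names are all hypothetical, and the brute-force cosine search is a stand-in, not the poster's actual vector database logic.

```python
import array
import math
import os

DIM = 4  # assumed embedding dimension, kept tiny for illustration
EMBEDDINGS_PATH = "/mnt/efs/embeddings.f32"  # hypothetical EFS mount point


def save_embeddings(path, vectors):
    """Write all vectors to one flat float32 file."""
    flat = array.array("f")
    for vec in vectors:
        if len(vec) != DIM:
            raise ValueError(f"expected {DIM} values, got {len(vec)}")
        flat.extend(vec)
    # Write to a temporary name first, then rename, so a reader
    # never sees a half-written file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        flat.tofile(f)
    os.replace(tmp_path, path)


def load_embeddings(path):
    """Read the flat file back into a list of DIM-length vectors."""
    flat = array.array("f")
    n_floats = os.path.getsize(path) // flat.itemsize
    with open(path, "rb") as f:
        flat.fromfile(f, n_floats)
    return [flat[i:i + DIM].tolist() for i in range(0, len(flat), DIM)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, vectors, k=1):
    """Indices of the k stored vectors most similar to the query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]
```

The container would call `save_embeddings(EMBEDDINGS_PATH, vectors)` after computing embeddings and `load_embeddings(EMBEDDINGS_PATH)` at startup, then search in memory. Loading once and searching in memory keeps network file system latency out of the query path, which is the main speed concern with this approach.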
4 Answers
It sounds like your code might be a bit over-engineered if it's too complex for Lambda. What exactly is making it so complicated? Also, have you considered using Pinecone? They have some decent free tier options that could be a good fit for your use case.
We use a production-ready template for our RAG implementation. It’s super flexible but definitely adds complexity to the cloud design!
RDS can be quite reasonably priced if you find the right setup. It's also worth exploring storage options beyond the default choices to see which ones fit your needs.
Have you tried consulting AI for suggestions? Seriously, consider asking ChatGPT for insights; it might offer a new perspective on your storage needs.
I’ve actually been chatting with ChatGPT all day; I'm just looking for some final thoughts!
Are you avoiding a database, or just an AWS-managed one? Depending on your needs, RDS could still work out to be cost-effective if configured correctly.
I'm aiming for the best cost/efficiency possible. Initially, I thought RDS might suit me, but I'm worried about costs.
I’ll definitely check Pinecone out, thanks!