Hey everyone! I'm working on a project with the following setup: a React frontend, a Python backend exposing a REST API, and two PostgreSQL databases with pgvector. One is static (Pgvector 1) and holds embeddings for about 1GB of PDF files used as general knowledge for the LLM. The other is dynamic (Pgvector 2) and grows with user chat sessions, since each session involves uploading a PDF that gets stored as embeddings. On top of that, we have a regular PostgreSQL instance for chat conversations and user logins. I'm trying to figure out which AWS EC2 instance family would be the best fit for this configuration, considering I'll soon have around 500 to 1000 users. Any advice or best practices on instance selection would be greatly appreciated!
2 Answers
For your setup, I think the real bottleneck is going to be PostgreSQL and your pgvector databases rather than React, which can be hosted cheaply elsewhere. So focus on instances that are memory- and I/O-optimized. A couple of rules to keep in mind: first, separate your concerns. Don't try to run the frontend, backend, and databases on a single EC2 instance if you anticipate that many users. Instead, put React on S3 behind CloudFront, and run the backend and the databases on separate instances. For the DB layer, use a memory-optimized instance like the `r6g.large` (16 GiB RAM) or `r6g.xlarge` (32 GiB RAM); the extra RAM lets more of the embeddings and their indexes stay cached, which makes vector queries faster. For backend services, you might choose a compute-optimized `c6g.medium` or a general-purpose `m6g.large`. For Postgres storage, definitely use gp3 EBS volumes, and provision IOPS above the 3,000 baseline if vector inserts and queries turn out to be I/O-bound. A good baseline to start with would be one `r6g.large` for Postgres/pgvector and one `m6g.large` for the backend. That way, you'll be set up for growth without breaking the bank!
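To sanity-check the RAM side of this, here is a rough back-of-the-envelope sketch. It assumes the storage figure from the pgvector README (4 bytes per dimension plus 8 bytes per `vector` value). The 1536-dimension embeddings, the workload numbers, and the `overhead_factor` for row and index overhead are all made-up placeholders, so plug in your own.

```python
# Rough estimate of how much space the embeddings take, to compare against
# instance RAM. Assumption: pgvector stores a `vector` value as
# 4 bytes per dimension + 8 bytes of overhead (per the pgvector README).
BYTES_PER_DIM = 4
VECTOR_HEADER_BYTES = 8


def vector_bytes(dimensions: int) -> int:
    """Storage for a single pgvector `vector` value, in bytes."""
    return BYTES_PER_DIM * dimensions + VECTOR_HEADER_BYTES


def estimate_embedding_gib(num_chunks: int, dimensions: int,
                           overhead_factor: float = 2.0) -> float:
    """Estimated size in GiB for `num_chunks` embeddings.

    `overhead_factor` is a hypothetical multiplier covering row headers,
    stored chunk text, and vector indexes. It is a guess, not a measurement.
    """
    raw_bytes = num_chunks * vector_bytes(dimensions)
    return raw_bytes * overhead_factor / 1024**3


# Hypothetical workload: 1,000 users x 5 PDFs each x 100 chunks per PDF,
# embedded with a 1536-dimension model.
chunks = 1000 * 5 * 100
print(f"~{estimate_embedding_gib(chunks, 1536):.1f} GiB for {chunks:,} chunks")
```

If the estimate lands well under the instance's RAM, the smaller size is probably fine to start with; if it is close to or above it, that is a hint to begin one size up.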
You might want to consider starting with memory-optimized instances like the `r6a` (AMD, x86) or `r6g` (Graviton, ARM) series. If you’re okay with trying out Graviton, `r6g` is typically cheaper, but make sure your Python libraries have arm64-compatible builds. Both options give you a lot of RAM for the cost. Also, AWS Compute Optimizer can be super helpful once you have metrics from real usage; it will recommend better instance types based on how your resources are actually being used. For now, I suggest starting small, maybe with an `r6a.large`, which has 16 GiB of RAM, then monitoring performance and scaling as needed. For storage, I’d recommend gp3 EBS volumes to avoid those annoying slowdowns during heavy uploads or queries.
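Until Compute Optimizer has enough data, the "start small, monitor, scale" loop can be approximated by hand. The sketch below is only an illustration: `suggest_size` is a hypothetical helper, the 80% and 30% thresholds are arbitrary starting points rather than AWS guidance, and the utilization samples are assumed to be percentages you have already collected (for example from CloudWatch).

```python
import math

# r6a sizes and their memory in GiB, smallest first.
R6A_SIZES = [
    ("r6a.large", 16),
    ("r6a.xlarge", 32),
    ("r6a.2xlarge", 64),
    ("r6a.4xlarge", 128),
]


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def suggest_size(current, cpu_pct, mem_pct, high=80.0, low=30.0):
    """Suggest the next r6a size from p95 CPU and memory utilization.

    Moves one size up if either p95 is at or above `high`, one size down
    if both are at or below `low`, and otherwise stays put.
    """
    names = [name for name, _ in R6A_SIZES]
    i = names.index(current)
    peak = max(percentile(cpu_pct, 95), percentile(mem_pct, 95))
    if peak >= high and i + 1 < len(names):
        return names[i + 1]
    if peak <= low and i > 0:
        return names[i - 1]
    return current


# Made-up samples: memory is consistently high, so the helper suggests scaling up.
print(suggest_size("r6a.large", cpu_pct=[20, 35, 90, 95], mem_pct=[85, 88, 92, 90]))
```

Using the 95th percentile instead of the average keeps short spikes during PDF uploads from being smoothed away, which matters for a workload that is bursty by nature.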
