I'm a developer with a solid background and I'm venturing into building an AI-powered customer service chatbot. About a year ago, I developed one using a LAMP stack on an EC2 instance. My chatbot needs to handle WhatsApp messages by routing them through a backend webhook. It should decide whether a request goes to a "New Customer Sales Agent" or an "Existing Customer Support Agent". The agent will utilize RAG to pull answers from an FAQ using vector embeddings and cosine similarity, and provide these answers through a large language model (LLM). Additionally, it should facilitate order creation and send a payment button back to users. An admin will manage the Q&A for RAG separately.
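To make the retrieval step concrete, here is a minimal sketch of ranking FAQ entries by cosine similarity. The three-dimensional vectors and the `FAQ` entries are made-up placeholders (real embeddings from an embedding model have hundreds or thousands of dimensions), and `top_k` is a hypothetical helper name, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)

def top_k(query_vec, faq_entries, k=2):
    """Return the k (score, entry) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, e["embedding"]), e) for e in faq_entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Placeholder FAQ with toy embeddings (assumption: in practice these
# would be produced by the embedding model and stored by the admin tool).
FAQ = [
    {"question": "How do I track my order?", "embedding": [0.9, 0.1, 0.0]},
    {"question": "What payment methods do you accept?", "embedding": [0.1, 0.9, 0.1]},
    {"question": "How do I return an item?", "embedding": [0.7, 0.0, 0.7]},
]

if __name__ == "__main__":
    for score, entry in top_k([1.0, 0.0, 0.0], FAQ):
        print(f"{score:.3f}  {entry['question']}")
```

The top-ranked entries would then be pasted into the LLM prompt as context. For a small FAQ, a brute-force scan like this may be enough before a dedicated vector database is needed.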
For the tech stack, I'm considering:
- WhatsApp Official Business API
- PHP webhook for activation
- API keys for direct access to Claude & ChatGPT
- OpenAI small embedding model for RAG
- OpenAI Whisper API for audio message transcription
- OpenAI multi-modal image recognition
- PHP Backend hosted on EC2
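For the webhook activation step, WhatsApp's platform sends a one-time GET request carrying `hub.mode`, `hub.verify_token`, and `hub.challenge`, and expects the challenge echoed back. A rough sketch of that check follows, written in Python for illustration (the same logic ports directly to PHP); `verify_webhook` and the token value are hypothetical names for this example.

```python
import hmac

def verify_webhook(params, expected_token):
    """Handle the GET verification handshake.

    params: dict of query-string parameters from the request.
    Returns a (status_code, body) tuple.
    """
    mode = params.get("hub.mode")
    token = params.get("hub.verify_token")
    challenge = params.get("hub.challenge", "")
    # Constant-time comparison avoids leaking the token via timing.
    token_ok = token is not None and hmac.compare_digest(
        token.encode("utf-8"), expected_token.encode("utf-8")
    )
    if mode == "subscribe" and token_ok:
        return 200, challenge  # echo the challenge to confirm ownership
    return 403, ""

if __name__ == "__main__":
    query = {
        "hub.mode": "subscribe",
        "hub.verify_token": "my-secret",  # placeholder token
        "hub.challenge": "1158201444",
    }
    print(verify_webhook(query, "my-secret"))
```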
I'm looking for advice on a suitable vector database for the RAG system. Also, I've opted for a PHP backend instead of Python, as I'm more familiar with PHP. I initially considered using Python scripts in AWS Lambda and API Gateway, but I worry about potential API timeouts, especially with Whisper and processing RAG. Any suggestions or insights on the infrastructure and tech stack would be greatly appreciated!
3 Answers
You might want to check out Amazon S3 Vectors for your vector database. I haven't tried it for vector storage yet myself, but it could fit well into your architecture.
Since you're dealing with potentially slow downstream APIs, your setup could benefit from an event-driven architecture. I recommend API Gateway, Lambda, SQS, and ECS on Fargate. For example, the webhook can put each incoming message on the queue and return immediately, while a worker handles the slow transcription and RAG calls. This setup can also scale down to zero when idle, which keeps costs low.
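The core idea is to decouple the fast acknowledgement from the slow processing. Here is a local sketch of that pattern, with Python's in-process `queue.Queue` and a thread standing in for SQS and a Fargate worker (an assumption made so the example runs anywhere). `handle_webhook`, `slow_process`, and `run_demo` are hypothetical names.

```python
import queue
import threading

def handle_webhook(jobs, message):
    """Webhook side: enqueue and acknowledge right away."""
    jobs.put(message)
    return 200  # the caller gets a fast response regardless of workload

def slow_process(message):
    """Placeholder for transcription, retrieval, and the LLM call."""
    return f"processed:{message['id']}"

def worker(jobs, results):
    """Worker side: drain the queue until a None sentinel arrives."""
    while True:
        message = jobs.get()
        try:
            if message is None:
                return
            results.append(slow_process(message))
        finally:
            jobs.task_done()

def run_demo():
    jobs = queue.Queue()
    results = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    statuses = [handle_webhook(jobs, {"id": i}) for i in (1, 2, 3)]
    jobs.put(None)  # tell the worker to stop
    jobs.join()     # wait until every queued item is handled
    thread.join()
    return statuses, results

if __name__ == "__main__":
    print(run_demo())
```

With a real queue, the webhook's response time no longer depends on how long Whisper or the LLM takes, which addresses the API timeout concern.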
Thanks! This is my first time hearing about Fargate and ASG, so I’ll definitely explore them further.
Have you considered Amazon Lex? It's a managed chatbot service from AWS that requires some manual setup for RAG but might streamline some processes for you. Alternatively, Amazon Bedrock could be a great fit: it supports native RAG implementations and offers options for both embedding generation (such as Titan) and vector storage (such as OpenSearch).
Bedrock sounds intriguing! I’ll look into it.

I’m also interested in S3; it's something I want to investigate further!