Looking for AWS Architecture and Cost Tips for My Language App

Asked By TechieTurtle99 On

I'm developing a mobile app for practicing Japanese conversations, and I'd love advice on setting up the AWS architecture cost-effectively. Here's the tech stack: the frontend will be React Native or Flutter, and the backend is Django. The conversation flow is Speech-to-Text, an AI response from an LLM (something like ChatGPT or Gemini), then Text-to-Speech.

My app is expected to handle around 1000 concurrent users, with many people potentially hitting the APIs at the same time. I'm particularly interested in understanding:
- What architecture would work best (like EC2, ECS, or Lambda) and how to manage asynchronous operations?
- How to process audio for multiple users concurrently?
- A rough monthly cost estimate for maintaining this setup?
- Common pitfalls to avoid when building a system like this?
Any real-world insights or suggestions would be super helpful!
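For context, here's roughly how I'm imagining the per-request flow on the backend. This is just a toy asyncio sketch with placeholder stage functions, not real SDK calls; the point is that each user's request awaits its own I/O, so many users can be in flight at once:

```python
import asyncio

# Placeholder stages: swap these for real STT / LLM / TTS network calls.
async def speech_to_text(audio: bytes) -> str:
    await asyncio.sleep(0)  # stands in for a network round trip
    return "transcript"

async def llm_reply(text: str) -> str:
    await asyncio.sleep(0)
    return f"reply to {text}"

async def text_to_speech(text: str) -> bytes:
    await asyncio.sleep(0)
    return text.encode()

async def handle_user(audio: bytes) -> bytes:
    # One user's request: STT -> LLM -> TTS, awaiting each stage.
    transcript = await speech_to_text(audio)
    reply = await llm_reply(transcript)
    return await text_to_speech(reply)

async def main(n_users: int) -> list:
    # Fan out: all users' pipelines run concurrently, not serially.
    return await asyncio.gather(*(handle_user(b"...") for _ in range(n_users)))

results = asyncio.run(main(1000))
print(len(results))  # → 1000
```

Is this the right mental model, and does it map cleanly onto Lambda/ECS?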

3 Answers

Answered By CloudGuru72 On

Going serverless is definitely the way to go. You can run your Django app on AWS Lambda (e.g., via Zappa or AWS's Lambda Web Adapter), which lets it absorb lots of concurrent users without breaking a sweat. For speech-to-text, consider Amazon Transcribe's streaming API; it runs roughly 2 cents per minute. You can also route the transcript through Amazon Bedrock for the LLM step, which keeps your data inside AWS and cuts down on latency and fees from outside services. Then use another Lambda function to send the response text to Amazon Polly for Text-to-Speech; that works out to about $15 per million characters. If you set everything up right, the total cost should be around $200 a month for a thousand users making about 100 requests a day.
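Totals here are extremely sensitive to per-request audio length and character counts, so it's worth plugging your own numbers into a quick calculator before trusting any monthly figure. A back-of-envelope sketch using the rates mentioned above; the per-request audio minutes and TTS character counts in the example call are assumptions, not measurements:

```python
# Rough monthly cost model for the STT -> TTS parts of the pipeline.
# Default rates mirror the figures quoted in this answer; check the
# current AWS pricing pages before relying on them.

def monthly_pipeline_cost(
    users: int,
    requests_per_user_per_day: int,
    audio_minutes_per_request: float,
    tts_chars_per_request: int,
    transcribe_per_minute: float = 0.02,    # ~2 cents/min, as quoted
    polly_per_million_chars: float = 15.0,  # ~$15 per 1M chars, as quoted
    days: int = 30,
) -> dict:
    requests = users * requests_per_user_per_day * days
    stt = requests * audio_minutes_per_request * transcribe_per_minute
    tts = requests * tts_chars_per_request / 1_000_000 * polly_per_million_chars
    return {"requests": requests, "stt_usd": round(stt, 2), "tts_usd": round(tts, 2)}

# Example: 1000 users, 100 requests/day each, ~6 s of audio and
# ~200 TTS characters per request (assumed values).
print(monthly_pipeline_cost(1000, 100, 0.1, 200))
```

Note this excludes the LLM/Bedrock and Lambda invocation costs, which you'd add on top.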

AppDevFanatic88 -

That’s a solid estimate! Good call on using Bedrock; I think it definitely makes a difference cost-wise.

Answered By StartupSage55 On

When you're starting out, don't design for 1000 concurrent users right off the bat. Build for one user, keep the architecture flexible, and iterate based on what you learn about your users before scaling up.

Answered By DevNinja64 On

I'd suggest rethinking the flow. If you're planning to use OpenAI's models anyway, it may be more efficient to use their Realtime API directly instead of chaining Speech-to-Text, then the LLM, then Text-to-Speech. Keeping the pipeline to a single hop can cut both cost and latency.
