I'm developing a document extraction pipeline on AWS for a client where we upload PDFs to S3. The upload triggers a series of Lambda functions that concatenate PDFs, extract text with Textract and a Bedrock VLM, redact PII with Comprehend, and finally extract structured data with Gemini running on Fargate. It works well with about 10 documents, but we now need it to handle bulk uploads of 500+ documents. I'm looking for advice on what to consider when scaling, particularly API rate limits, Lambda concurrency, and whether spinning up Fargate for each file is efficient at that scale.
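For context, the text-extraction trigger is roughly shaped like the sketch below (bucket and function names are placeholders, and error handling is stripped out):

```python
# Simplified sketch of one stage: S3 upload event -> async Textract text-detection job.
import boto3

textract = boto3.client("textract")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Kick off an asynchronous text-detection job for the uploaded PDF.
        response = textract.start_document_text_detection(
            DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        print(f"Started Textract job {response['JobId']} for s3://{bucket}/{key}")
```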
5 Answers
This setup sounds quite similar to a project I came across! You might want to check this sample solution on GitHub for insights: https://github.com/aws-samples/aws-ai-intelligent-document-processing/tree/main/guidance/prompt-flow-orchestration. It could spark some ideas for scaling.
You shouldn't hit Lambda concurrency limits with 500 documents; the default quota is 1,000 concurrent executions per region, and API Gateway limits are similarly generous. The services more likely to bite are Textract, Bedrock, and Comprehend, which each have per-account throughput quotas, so look those up for your region. If you're worried about downstream bottlenecks, consider putting an SQS queue in front of the heavier stages so you control the rate at which documents flow through, along the lines of the sketch below.
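A minimal sketch of that buffering idea, assuming a thin Lambda that only enqueues each uploaded document and leaves the heavy API calls to a rate-limited consumer (the queue URL env var and message fields are made up for illustration):

```python
# Sketch: instead of fanning every S3 upload straight into the heavy Lambdas,
# enqueue a small message per document and let consumers pull at their own pace.
import json
import os
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = os.environ["DOCUMENT_QUEUE_URL"]  # hypothetical env var

def handler(event, context):
    for record in event["Records"]:
        message = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
        }
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(message))
```

Reserved concurrency on the consumer Lambda (or the SQS event source's maximum-concurrency setting) then becomes the knob that keeps you under the Textract/Bedrock quotas.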
Have you thought about writing your Lambda functions in Rust? It could give you noticeably faster processing, though make sure your team is comfortable with Rust first.
I'd just add a queue so documents are processed in controlled batches rather than all at once; it's a straightforward way to absorb a bulk upload of 500+ files. A rough consumer-side sketch follows.
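Something along these lines, assuming SQS triggers the Lambda with batches and ReportBatchItemFailures is enabled on the event source mapping (the Textract call here stands in for whichever downstream service is your bottleneck):

```python
# Sketch of an SQS-triggered consumer processing documents in batches.
# Messages that hit throttling are reported back as failures so SQS redelivers them later.
import json
import boto3
from botocore.exceptions import ClientError

textract = boto3.client("textract")

def handler(event, context):
    failures = []
    for record in event["Records"]:
        body = json.loads(record["body"])
        try:
            textract.start_document_text_detection(
                DocumentLocation={
                    "S3Object": {"Bucket": body["bucket"], "Name": body["key"]}
                }
            )
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code in ("ThrottlingException", "ProvisionedThroughputExceededException"):
                # Let SQS retry this message after the visibility timeout expires.
                failures.append({"itemIdentifier": record["messageId"]})
            else:
                raise
    return {"batchItemFailures": failures}
```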
How urgent are these jobs? If they can be queued and processed in batches, that would help a lot. Also, consider how much memory a single job might need. Lastly, think about whether your service really needs to scale down to zero, or if having a baseline compute capacity makes sense for you.

Not sure that's necessary unless your team is already experienced with Rust. It might add complexity for those unfamiliar with it.