I'm working on an API proxy that takes in requests from a source system and processes queries with DynamoDB before forwarding requests to a target API. During peak times, the source system could send 100+ requests per second, but the target API has a strict limit of, say, 3 requests per second. If we exceed this rate, requests get dropped with an error, which isn't ideal. There could also be times when there are no requests for about an hour. Ideally, I want the requests to be sent to the target API within a minute or two of being received. I need a way to throttle the outgoing requests to match the target API's rate while still being able to handle bursts in incoming requests. I'm seeking advice on how to set this up in AWS effectively and any potential pitfalls to watch out for. Thanks for your help!
1 Answer
I recommend you start by funneling the incoming requests from the source system into an SQS queue. Then, create a separate Lambda function that polls the queue and forwards requests at a controlled rate matching the target API's limit. Set the consumer Lambda's reserved concurrency to 1 so only one instance runs at a time and you don't send requests too quickly. Also make sure to implement robust error handling, and think about how the source system receives a response, since this design makes the whole flow asynchronous.
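To make the pacing concrete, here's a minimal sketch of what the consumer Lambda could look like. The 3 req/s limit, the `forward_to_target` helper, and the handler name are all assumptions for illustration; replace them with your actual target-API call.

```python
import time

# Assumed limit from the question: the target API allows ~3 requests per second.
MAX_PER_SECOND = 3
MIN_INTERVAL = 1.0 / MAX_PER_SECOND

_last_call = 0.0


def throttle():
    """Sleep just long enough to keep outgoing calls at or below MAX_PER_SECOND."""
    global _last_call
    wait = MIN_INTERVAL - (time.monotonic() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.monotonic()


def forward_to_target(body):
    # Hypothetical placeholder: replace with the real HTTP call to the target API.
    pass


def handler(event, context):
    """SQS-triggered Lambda: pace each queued message out to the target API."""
    for record in event["Records"]:
        throttle()
        forward_to_target(record["body"])
```

Because reserved concurrency is 1, only one copy of this loop runs at a time, so the sleep-based pacing actually bounds the global outgoing rate. Note that with an SQS trigger, messages in a batch that fail should be surfaced (e.g. via partial batch responses) so they return to the queue rather than being silently lost.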
I was worried about costs too, but I found long polling makes it pretty affordable. You'll want to test different polling frequencies based on demand to optimize costs.
That's a solid approach! Consider using long polling and adjusting your batch window size. That builds a delay into the Lambda invocations without incurring extra cost while waiting.
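For reference, both settings mentioned above are configurable with the AWS CLI. The queue name, function name, and ARN below are placeholders; the flags themselves (`ReceiveMessageWaitTimeSeconds` for long polling, `--maximum-batching-window-in-seconds` for the batch window) are real options.

```shell
# Enable long polling on the queue (up to 20 seconds per receive call).
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-proxy-queue \
  --attributes ReceiveMessageWaitTimeSeconds=20

# Wire the queue to the consumer Lambda with a batching window, so the
# function is invoked with accumulated batches instead of per message.
aws lambda create-event-source-mapping \
  --function-name forward-to-target \
  --event-source-arn arn:aws:sqs:us-east-1:123456789012:my-proxy-queue \
  --batch-size 10 \
  --maximum-batching-window-in-seconds 30
```

The batching window trades a little latency (well within the asker's one-to-two-minute budget) for fewer, larger invocations, which keeps costs down during the quiet hours mentioned in the question.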