Why are some messages disappearing in Kinesis and EventBridge?

Asked By TechWhiz83 On

I'm facing a puzzling problem with my Kinesis and EventBridge setup. Here's a quick overview of my infrastructure: a PHP application running on ECS produces messages to SQS and Kinesis, and an AWS Lambda function acts as the message listener. Between the two, an EventBridge pipe filters and processes the messages before passing them on to the Lambda function. I've also configured retries and a dead letter queue, and I'm logging everything at the trace level.

Most of the time everything works fine, but every now and then about 0.5% to 0.8% of my messages (anywhere from 1 to 300) don't get consumed at all. I can see that Kinesis accepted them: I have the JSON data along with the Kinesis shard ID and sequence number for each one. However, these messages never appear in the logs and never get processed. The stream drops data after 24 hours, but I have alarms set to notify me if messages are older than an hour, and no alarm has been triggered.

Despite thousands of messages being processed correctly, a few just seem to vanish. I've checked CloudWatch logs for Lambda and can't find any errors. It feels like my messages are being accepted but then lost somewhere in the pipeline. Does anyone have any insights or suggestions on how I could troubleshoot this further? I'd appreciate any advice aside from moving to a different system altogether!
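One way to narrow this down is to reconcile what the producer sent against what the consumer logged, grouped by shard. The sketch below assumes you can export both sides as (shard ID, sequence number) pairs; the function name `find_unconsumed` and the input shapes are hypothetical, not part of any AWS SDK.

```python
from collections import defaultdict


def find_unconsumed(produced, consumed):
    """Group records that were produced but never logged by the consumer.

    produced: iterable of (shard_id, sequence_number) pairs taken from the
              producer's PutRecord responses (assumed to be logged somewhere).
    consumed: iterable of (shard_id, sequence_number) pairs parsed from the
              consumer's logs.
    Returns a dict mapping shard_id to a sorted list of missing sequence numbers.
    """
    seen = set(consumed)
    missing = defaultdict(list)
    for shard_id, seq in produced:
        if (shard_id, seq) not in seen:
            missing[shard_id].append(seq)
    # Sequence numbers are long decimal strings, so sort them numerically
    # rather than lexically.
    return {shard: sorted(seqs, key=int) for shard, seqs in missing.items()}
```

If the missing sequence numbers cluster together on one shard, that hints at a whole batch being skipped; if they are scattered, something is more likely rejecting individual records.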

2 Answers

Answered By CloudGuardian77 On

Just to clarify, Kinesis records can't be deleted individually; they simply age out of the stream once the retention period passes. When you mention 'empty Kinesis', are you sure the messages just expired, or are they genuinely not being read by any consumer? If Kinesis accepted them, it’s strange for them to vanish unless there’s a problem with how the EventBridge pipe is filtering or processing them. Something might be swallowing errors, so that records get checkpointed without being fully handled. Could you share more details on how your EventBridge pipe is configured? That could be a key factor in this issue.
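To test the filtering theory offline, you could replay the JSON of the missing messages against your filter pattern. As far as I know, a pipe discards records that don't match the filter and moves past them without raising an error, so they would never reach the Lambda or the dead letter queue. The sketch below is a deliberately simplified matcher: it handles only nested keys and lists of literal values, not operators such as `prefix` or `numeric`. The function names are hypothetical, and it assumes the pattern addresses the decoded payload under a `data` key.

```python
import json


def matches(pattern, event):
    """Check an event against a simplified pattern.

    Supported subset only:
      - a dict in the pattern recurses into the same key of the event
      - a list in the pattern means "the event value equals one of these literals"
    Operator objects (prefix, numeric, exists, ...) are NOT handled here.
    """
    for key, expected in pattern.items():
        if not isinstance(event, dict) or key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not matches(expected, actual):
                return False
        elif isinstance(expected, list):
            # An array in the event matches if any of its elements is listed.
            values = actual if isinstance(actual, list) else [actual]
            if not any(v in expected for v in values):
                return False
        else:
            raise ValueError("pattern leaves must be lists of literals")
    return True


def would_be_filtered(pattern, raw_data):
    """Return True if this payload would fail the pattern (or isn't valid JSON)."""
    try:
        event = {"data": json.loads(raw_data)}
    except json.JSONDecodeError:
        # A payload that can't be parsed can't match a pattern on its fields.
        return True
    return not matches(pattern, event)
```

For example, with the pattern `{"data": {"type": ["order"]}}`, a payload of `{"Type": "order"}` would be filtered because key matching is case-sensitive. Running your vanished payloads through a check like this can show whether they share a field value, a casing difference, or malformed JSON.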

Answered By DataDynamo On

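It sounds like you might have a rogue client still connected somewhere that could be consuming those few messages. I had a similar issue with Redis a while back. It might be worth checking if anything unexpected is pulling from your queues or streams! Keep in mind that Kinesis consumers read independently and don't remove records for each other, so this is more likely to bite on the SQS side than on the Kinesis side.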

CuriousCoder12 -

Thanks for the tip! I’ll definitely look into that. Better safe than sorry!
