I need to delete 100,000 items from DynamoDB in a way that guarantees consistency: essentially an all-or-nothing operation. However, the TransactWriteItems API supports at most 100 items per transaction, and I'm concerned about data inconsistency if the delete operation fails midway. I've also looked into Global Secondary Indexes (GSIs), but I couldn't find a viable solution that works with them. What strategies should I consider for achieving consistent deletions at this scale?
5 Answers
Honestly, after using DynamoDB for six years, I found it increasingly hard to maintain, especially for complex projects. We eventually migrated to RDS Aurora: DynamoDB was great for rapid prototyping, but its consistency limitations and maintenance overhead really became a pain.
I had a similar issue where I needed to delete a ton of transactions based on a timestamp that I didn't know in advance. The solution I found was to create a secondary table called "DeleteMarkers." When the user action took place, I stored the timestamp there and then queried both that table and the main transaction table to filter out the transactions that needed deletion. This way, I handled most of it in the background without compromising the performance during reads.
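A minimal sketch of that marker-table idea, with hypothetical attribute names (`deleted_before` on the marker, `timestamp` on the transaction); the read-time filtering is plain Python, and the boto3 queries are left as untested comments:

```python
def filter_deleted(transactions, delete_markers):
    """Drop transactions at or before the latest delete marker.

    Both arguments are lists of item dicts as a DynamoDB query would
    return them; the attribute names are assumptions.
    """
    cutoffs = [m["deleted_before"] for m in delete_markers]
    if not cutoffs:
        return transactions
    latest_cutoff = max(cutoffs)
    return [t for t in transactions if t["timestamp"] > latest_cutoff]

# At read time you would query both tables, then filter (boto3 sketch):
#   from boto3.dynamodb.conditions import Key
#   markers = marker_table.query(KeyConditionExpression=Key("user_id").eq(uid))["Items"]
#   txns    = txn_table.query(KeyConditionExpression=Key("user_id").eq(uid))["Items"]
#   visible = filter_deleted(txns, markers)
```

The actual deletes can then run in the background at whatever pace your write capacity allows, since readers never see the marked rows.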
If real deletions are becoming too cumbersome, have you considered just ignoring unwanted items during read operations? If they share some common attributes, you could filter them out in the read process. That way, you can handle deletions as you see fit without affecting your read performance.
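A tiny sketch of that client-side filtering, assuming a hypothetical `deleted` attribute; the server-side `FilterExpression` equivalent is shown as a comment:

```python
def is_visible(item):
    # Hide anything flagged with deleted=True (attribute name is an assumption).
    return not item.get("deleted", False)

items = [{"id": 1}, {"id": 2, "deleted": True}]
visible = [i for i in items if is_visible(i)]

# Server-side equivalent with boto3 (untested sketch). Note that a
# FilterExpression is applied AFTER the read, so filtered-out items
# still consume read capacity:
#   from boto3.dynamodb.conditions import Attr
#   table.scan(FilterExpression=Attr("deleted").ne(True))
```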
DynamoDB doesn't offer native support for consistent bulk deletions at that scale. One option is to add an attribute like `deleted=true` and filter it out in your queries. Another is to use BatchWriteItem (limited to 25 items per call), keeping track of which batches have completed so that if a failure occurs you can resume instead of restarting from scratch. Just keep in mind that if you really need true all-or-nothing deletion of that many records, you might want to consider a relational database instead; DynamoDB isn't built for that kind of heavy lifting.
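A rough sketch of the resumable batch-delete idea: the chunking and checkpoint bookkeeping are plain Python, while the actual delete calls (and where you durably persist the checkpoint) are assumptions left as comments:

```python
def chunk_keys(keys, size=25):
    """Split keys into chunks; BatchWriteItem accepts at most 25 requests per call."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]

def pending_batches(batches, completed):
    """Skip batch indexes already recorded as done, so a rerun resumes rather than restarts."""
    return [(i, b) for i, b in enumerate(batches) if i not in completed]

# For each pending batch (boto3 sketch; checkpoint storage is up to you --
# another table, S3, a file). table.batch_writer() retries unprocessed
# items automatically:
#   for i, batch in pending_batches(chunk_keys(all_keys), completed):
#       with table.batch_writer() as writer:
#           for key in batch:
#               writer.delete_item(Key=key)
#       completed.add(i)  # persist this checkpoint durably before moving on
```

This never gives you atomicity, but it does give you idempotent, resumable progress, which is usually what "consistency" means in practice at this scale.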
This requirement you're facing raises some red flags for me. It sounds like something might be fundamentally wrong with the way your project is set up. One theoretical approach could be to add an "obsolete" field to your records, set that field when you're ready to consider those records deleted, and then let a TTL manage the cleanup later. But adapting your software to recognize that obsolete field could be tricky, especially if you have multiple systems relying on timely data.
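A sketch of that obsolete-plus-TTL approach. The TTL attribute must hold epoch seconds as a Number type, and DynamoDB typically removes expired items within about 48 hours rather than immediately; the attribute names here are assumptions:

```python
import time

def ttl_epoch(grace_seconds=7 * 24 * 3600, now=None):
    """Epoch seconds after which DynamoDB's TTL process may delete the item.

    The grace period gives dependent systems time to notice the obsolete
    flag before the record physically disappears.
    """
    now = int(time.time()) if now is None else now
    return now + grace_seconds

# Marking an item (boto3 sketch; "obsolete" and "expires_at" are assumed names,
# and "expires_at" must be configured as the table's TTL attribute):
#   table.update_item(
#       Key={"pk": pk},
#       UpdateExpression="SET obsolete = :t, expires_at = :e",
#       ExpressionAttributeValues={":t": True, ":e": ttl_epoch()},
#   )
```

TTL deletions consume no write capacity, which makes this attractive for 100,000 items, as long as "eventually gone" is acceptable.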
