How can I ensure data consistency when deleting a large number of items in DynamoDB?

Asked By TechieNinja77 On

I'm facing a challenge with deleting 100,000 items from DynamoDB and I need a solution that guarantees consistency. I know that TransactWriteItems is limited to 100 items, and I want to avoid data inconsistency in case any delete operation fails during the process. Since I couldn't implement a solution using a Global Secondary Index (GSI), I'm looking for the best way to handle bulk deletions manually while ensuring everything is consistent.

5 Answers

Answered By CodeWarrior88 On

I encountered a similar issue where I had to delete millions of records. What I did was create a separate table for 'DeleteMarkers'. When the user triggered the delete, I'd record the timestamp in that table. Then, when querying transactions, I'd check against those DeleteMarkers to know which transactions to ignore, while deleting them in the background using TTL. It wasn’t perfect, but it worked for my needs without causing inconsistencies.
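
To make the marker check above concrete, here is a minimal sketch of the read-side filtering. It is not CodeWarrior88's actual code: the field names `owner_id` and `created_at`, and the idea of keying markers by owner, are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List


def visible_transactions(
    transactions: Iterable[Dict[str, Any]],
    delete_markers: Dict[str, int],
) -> List[Dict[str, Any]]:
    """Drop transactions that are covered by a delete marker.

    delete_markers maps a (hypothetical) owner id to the epoch timestamp at
    which that owner requested deletion. Anything created at or before that
    moment is treated as deleted, even if the row still exists physically.
    """
    visible = []
    for tx in transactions:
        marker = delete_markers.get(tx["owner_id"])
        if marker is not None and tx["created_at"] <= marker:
            continue  # logically deleted; background cleanup removes it later
        visible.append(tx)
    return visible
```

Writing the marker is a single-item write, so the "delete" takes effect for readers all at once, and the physical cleanup can fail and be retried without readers ever seeing a half-deleted set.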

Answered By DataGuru99 On

DynamoDB doesn't natively support atomic delete operations at that scale. You can try adding a 'deleted=true' attribute to your items and filtering them out in your queries, or use BatchWriteItem while keeping a log of which batches succeeded. Keep in mind that BatchWriteItem takes at most 25 requests per call and is not atomic: some deletes in a batch can succeed while others come back as unprocessed, so you have to retry those. If something fails, you just resume from where you left off rather than starting all over again. But if you really need that all-or-nothing approach for 100k records, you might want to reconsider using a relational database instead.
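
The batch-and-log idea could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: `client` is any object exposing `batch_write_item` (for example a boto3 DynamoDB client), `done_batches` is a hypothetical in-memory checkpoint that you would persist somewhere durable, and batch indexes are only stable if the key list is in the same order on every run.

```python
import time
from typing import Any, Dict, Iterator, List, Set, Tuple

BATCH_LIMIT = 25  # BatchWriteItem accepts at most 25 requests per call

Key = Dict[str, Dict[str, Any]]  # low-level format, e.g. {"pk": {"S": "abc"}}


def chunked(keys: List[Key], size: int = BATCH_LIMIT) -> Iterator[Tuple[int, List[Key]]]:
    """Yield (batch_index, batch) pairs of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield start // size, keys[start:start + size]


def delete_in_batches(
    client: Any,
    table_name: str,
    keys: List[Key],
    done_batches: Set[int],
    max_retries: int = 5,
) -> Set[int]:
    """Delete keys batch by batch, skipping batches already checkpointed."""
    for index, batch in chunked(keys):
        if index in done_batches:
            continue  # finished in an earlier run
        pending = [{"DeleteRequest": {"Key": key}} for key in batch]
        for attempt in range(max_retries):
            response = client.batch_write_item(RequestItems={table_name: pending})
            # Throttled or otherwise unprocessed requests are handed back for retry.
            pending = response.get("UnprocessedItems", {}).get(table_name, [])
            if not pending:
                break
            time.sleep(min(0.05 * 2 ** attempt, 1.0))  # simple exponential backoff
        else:
            raise RuntimeError(f"batch {index}: {len(pending)} items still unprocessed")
        done_batches.add(index)  # persist this checkpoint in a real job
    return done_batches
```

Because deleting an item that is already gone is a no-op, re-running a batch after a crash is safe. What this does not give you is all-or-nothing behavior: readers can see a partially deleted set while the job runs, which is why it is usually paired with a read-time flag or marker.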

QueryMaster23 -

I totally get it, sometimes it feels overwhelming when your project relies heavily on interconnected data. Making a switch to a relational database might save you a lot of headaches in the long run.

Answered By DesignDynamo On

It seems like you’re in a tough spot! One option could be to update your records with an 'obsolete' flag and then clean them up later. Just remember, this approach may require changes in your application logic to accommodate that 'obsolete' check. But this whole scenario shows that careful planning is critical when working with DynamoDB.

UserUnknown01 -

Agreed! It’s crucial to plan these things in advance. If your project requires complex relationships, you might be right in thinking that a relational database could be a better fit.

Answered By FlexibleCoder On

Could you consider ignoring unwanted items at read-time instead? If they share a common attribute, filtering them during reads makes them functionally 'deleted'. Then, you can manage deletions in the background without affecting your read performance too much.
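
As a rough illustration of read-time filtering, the helper below builds the parameters for a DynamoDB Query that hides flagged items. The helper name, the `deleted` attribute, and the assumption of a string partition key are all hypothetical; the returned dict is meant to be passed to a low-level client as `client.query(**params)`.

```python
from typing import Any, Dict


def build_live_items_query(
    table_name: str,
    partition_key_name: str,
    partition_key_value: str,
    flag_name: str = "deleted",
) -> Dict[str, Any]:
    """Build Query parameters that skip items carrying a truthy delete flag."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        # Keep items that have no flag at all, or whose flag is explicitly false.
        "FilterExpression": "attribute_not_exists(#flag) OR #flag = :false",
        "ExpressionAttributeNames": {"#pk": partition_key_name, "#flag": flag_name},
        "ExpressionAttributeValues": {
            ":pk": {"S": partition_key_value},
            ":false": {"BOOL": False},
        },
    }
```

One caveat: a filter expression is applied after the items are read, so flagged items still consume read capacity until the background cleanup physically removes them.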

Answered By DevGuru42 On

Honestly, it sounds like you might want to reconsider your database choice. After using DynamoDB for several years, we switched to RDS Aurora. While DynamoDB has its perks for quick prototyping, it can lead to complications with maintenance and consistency over time.
