I'm exploring how best to design my DynamoDB table given the following access patterns and requirements. I have a partition key (pk) and a sort key (sk), with attributes including an ID, a timestamp, some basic fields (a0 to an), and a JSON field (r). The JSON field is populated later and is under 200KB, which fits comfortably within DynamoDB's 400KB item size limit. My access patterns are:
1) Get attributes and/or r by pk and sk.
2) Get the latest item's attributes by pk alone.
3) Update a0-an attributes and/or r by pk and sk.
4) Update a specific value in r by pk and item-id.
I've considered three design options:
1) Single item with all attributes and r as a JSON blob.
2) Multiple items where r creates new items with modified sk.
3) A single item with all attributes and r as a List of Maps.
I'm leaning towards Option 3. Does this sound like the right approach?
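For concreteness, here is a minimal sketch of what an Option 3 item could look like, plus how access pattern 4 might be served against it. All names and values (`user#123`, `item_id`, `value`, etc.) are illustrative assumptions, not your real schema; note that DynamoDB list updates are positional, so the target index in `r` has to be found client-side first (a read before the write):

```python
# Sketch of an Option 3 item: top-level attributes plus r as a List of Maps.
item = {
    "pk": "user#123",
    "sk": "2024-05-01T12:00:00Z",  # ISO-8601 timestamp as sort key
    "a0": "value0",
    "a1": "value1",
    "r": [
        {"item_id": "i-1", "value": 10},
        {"item_id": "i-2", "value": 20},
    ],
}

def build_update_params(item, item_id, new_value):
    """Build UpdateItem parameters that set one entry of r by item-id.

    List elements are addressed by position (r[i]), so we locate the
    target map's index client-side before issuing the update.
    """
    idx = next(i for i, m in enumerate(item["r"]) if m["item_id"] == item_id)
    return {
        "Key": {"pk": item["pk"], "sk": item["sk"]},
        "UpdateExpression": f"SET r[{idx}].#v = :val",
        "ExpressionAttributeNames": {"#v": "value"},
        "ExpressionAttributeValues": {":val": new_value},
    }

params = build_update_params(item, "i-2", 99)
# params["UpdateExpression"] == "SET r[1].#v = :val"
```

The read-before-write to find the index is the main cost of Option 3 relative to Option 2, where each item-id would be directly addressable by key.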
2 Answers
Have you considered a Local Secondary Index (LSI) on the item-id? It could complement Option 1: indexes are sparse, so an item only appears in the LSI once its item-id attribute is populated, meaning you don't have to set it at creation time. Just keep in mind that having an LSI caps each item collection (all items sharing a pk) at 10GB, so make sure that fits your use case.
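As a sketch, assuming item-id were promoted to a top-level attribute so the LSI can use it as its sort key (the table name and index name here are hypothetical), a query against that index might be shaped like this:

```python
def build_lsi_query(pk, item_id):
    """Build Query parameters for a hypothetical LSI keyed on item_id.

    An LSI shares the table's partition key; only the sort key differs,
    so the query must still supply pk.
    """
    return {
        "TableName": "my-table",        # assumed table name
        "IndexName": "item-id-index",   # hypothetical LSI name
        "KeyConditionExpression": "pk = :pk AND item_id = :iid",
        "ExpressionAttributeValues": {":pk": pk, ":iid": item_id},
    }

q = build_lsi_query("user#123", "i-2")
```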
One thing to watch out for is using a timestamp as part of your sk. With a timestamp string as the sort key, you can't retrieve a specific item unless you know its exact value, which could complicate your queries.
But isn't that exactly what access pattern 2 addresses? We only need the latest entry per pk, and ISO-format strings sort lexicographically in chronological order, so a descending query limited to one item returns the newest entry. Since users usually just poll for the latest, it shouldn't pose much of an issue.
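To illustrate, here is a sketch (with assumed attribute names) of the query parameters for "latest item by pk": because ISO-8601 timestamp strings sort lexicographically in chronological order, reversing the sort and limiting to one item yields the newest entry:

```python
def latest_item_params(pk):
    """Query parameters that fetch only the newest item for a pk.

    With ISO-8601 timestamps as sk, lexicographic order equals
    chronological order, so a reversed scan gives newest-first.
    """
    return {
        "KeyConditionExpression": "pk = :pk",
        "ExpressionAttributeValues": {":pk": pk},
        "ScanIndexForward": False,  # descending sk order
        "Limit": 1,                 # newest item only
    }

# ISO-8601 strings sort chronologically as plain strings:
stamps = ["2024-05-01T12:00:00Z", "2023-12-31T23:59:59Z", "2024-05-01T11:00:00Z"]
assert max(stamps) == "2024-05-01T12:00:00Z"
```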
But how would that work with an LSI? Wouldn't each item_id need its own pk and sk? That resembles Option 2, which has querying disadvantages. Also, you mentioned having hundreds of item_ids; how would that impact performance?