Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get most recent data from DynamoDB for each primary partition key?

I have a table in dynamodb. It stores account stats. It's possible that the account stats will be updated several times per day. So table records may look like:

+------------+--------------+-------+-------+
| account_id | record_id    | views | stars |
+------------+--------------+-------+-------+
| 3          | 2019/03/16/1 | 29    | 3     |
+------------+--------------+-------+-------+
| 2          | 2019/03/16/2 | 130   | 21    |
+------------+--------------+-------+-------+
| 1          | 2019/03/16/3 | 12    | 2     |
+------------+--------------+-------+-------+
| 2          | 2019/03/16/1 | 57    | 12    |
+------------+--------------+-------+-------+
| 1          | 2019/03/16/2 | 8     | 2     |
+------------+--------------+-------+-------+
| 1          | 2019/03/16/1 | 3     | 0     |
+------------+--------------+-------+-------+

account_id is a primary partition key. record_id is a primary sort key

How I can get only latest records for each of the account_ids? So from the example above I expect to get:

+------------+--------------+-------+-------+
| account_id | record_id    | views | stars |
+------------+--------------+-------+-------+
| 3          | 2019/03/16/1 | 29    | 3     |
+------------+--------------+-------+-------+
| 2          | 2019/03/16/2 | 130   | 21    |
+------------+--------------+-------+-------+
| 1          | 2019/03/16/3 | 12    | 2     |
+------------+--------------+-------+-------+

This data is convenient to use for a reporting purposes.

like image 269
Alex Fruzenshtein Avatar asked Sep 18 '25 06:09

Alex Fruzenshtein


1 Answers

This can be done quite efficiently if you know the list of account_ids that you have stored in the table.

In that case, all you need to do is to query primary keys, one by one, sorting values using ScanIndexForward=False and limiting the result to 1 item using Limit=1.

Here is code in python

import boto3
import json

client = boto3.client('dynamodb')

account_ids = ['1', '2', '3']
results = []

for aid in account_ids:
    result = client.query(
        TableName='test-table',
        KeyConditionExpression="#aid = :aid",
        ExpressionAttributeNames={
            '#aid': 'account_id'
        },
        ExpressionAttributeValues={
            ':aid': {
                'N': aid
            }
        },
        ScanIndexForward=False,
        Limit=1,
    )
    results.append(result['Items'])

print(json.dumps(results, indent=2))
like image 138
Matus Dubrava Avatar answered Sep 20 '25 18:09

Matus Dubrava