 

Pagination in Amazon DynamoDB using Boto

How do I paginate my results from DynamoDB using the Boto Python library? From the Boto API documentation, I can't figure out whether it supports pagination at all, although the DynamoDB API itself does.

Anand asked Oct 31 '12 18:10



2 Answers

Boto does support pagination-like behavior, using a combination of "ExclusiveStartKey" and "Limit". For example, you can paginate a Scan this way.

Here is an example that should read a whole table in chunks of 10:

esk = None

while True:
    # Load this batch of at most 10 items.
    scan_generator = MyTable.scan(max_results=10, exclusive_start_key=esk)

    # Do something useful with each item, counting how many came back.
    count = 0
    for item in scan_generator:
        count += 1  # do something useful

    # Are we done yet? A batch shorter than 10 items means the table is exhausted.
    if count < 10:
        break

    # Load the last keys
    esk = scan_generator.kwargs['exclusive_start_key'].values()
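The "Limit" / "ExclusiveStartKey" contract that this loop relies on can be tried without AWS credentials. The following is a minimal sketch, not Boto code: `ITEMS`, `fake_scan` and `fetch_all_pages` are hypothetical names, and `fake_scan` is an in-memory stand-in that only imitates the shape of a Scan response (an "Items" list plus a "LastEvaluatedKey" when more data remains).

```python
# Hypothetical in-memory stand-in for a table: 25 items whose "id" equals
# their position, so the id can double as the resume key.
ITEMS = [{"id": i} for i in range(25)]


def fake_scan(limit, exclusive_start_key=None):
    """Return up to `limit` items located after `exclusive_start_key`,
    plus a LastEvaluatedKey when items remain beyond this page."""
    start = 0 if exclusive_start_key is None else exclusive_start_key + 1
    page = ITEMS[start:start + limit]
    response = {"Items": page}
    if start + limit < len(ITEMS):
        # More items remain, so tell the caller where to resume.
        response["LastEvaluatedKey"] = page[-1]["id"]
    return response


def fetch_all_pages(page_size):
    """Collect every page by feeding LastEvaluatedKey back as the start key."""
    pages = []
    esk = None
    while True:
        response = fake_scan(page_size, exclusive_start_key=esk)
        pages.append(response["Items"])
        esk = response.get("LastEvaluatedKey")
        if esk is None:  # no key in the response: this was the last page
            break
    return pages


pages = fetch_all_pages(page_size=10)
print([len(p) for p in pages])  # page sizes: [10, 10, 5]
```

In a page-by-page UI you would issue one such request per page view and keep the returned key around (for example in the "next page" link) instead of looping.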

EDIT:

As pointed out by @garnaat, it is possible that I misunderstood your actual goal. The above suggestion lets you provide pagination the way SO does for questions, for example: no more than 15 per page.

If you just need a way to load the whole result set produced by a given Scan, Boto is a great library and already abstracts this for you, with no need for black magic like in my answer. In this case, you should follow what @garnaat advises. By the way, he is the author of Boto and, as such, a great reference for Boto-related questions :)

yadutaf answered Nov 04 '22 10:11


Perhaps I'm misunderstanding the question, but I think you are making it more difficult than it needs to be. If you are using the layer2 DynamoDB interface in boto (the default), it handles the pagination for you.

So, if you want to do a query operation, you simply do this:

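import boto

# Connect using the default (layer2) DynamoDB interface.
c = boto.connect_dynamodb()
t = c.get_table('mytable')

# Iterating over the query result fetches further pages as needed.
for item in t.query(hash_key='foo'):
    print(item)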

This will automatically handle the pagination of results from DynamoDB. The same would also work for a scan request.
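To illustrate what "handling the pagination for you" means, here is a rough sketch of the general idea: a generator that follows "LastEvaluatedKey" behind the scenes, so the caller only sees a flat stream of items. This is not boto's actual implementation; `RESULTS`, `calls`, `fake_query_page` and `iter_items` are hypothetical names, and the data lives in memory.

```python
# Hypothetical result set and a log of how many page requests were made.
RESULTS = ["item-%d" % i for i in range(7)]
calls = []


def fake_query_page(exclusive_start_key=None, page_size=3):
    """Stand-in for one paginated request; the key is the index of the
    last item returned on the previous page."""
    calls.append(exclusive_start_key)
    start = 0 if exclusive_start_key is None else exclusive_start_key + 1
    page = RESULTS[start:start + page_size]
    response = {"Items": page}
    end = start + len(page)
    if end < len(RESULTS):
        response["LastEvaluatedKey"] = end - 1
    return response


def iter_items(fetch_page):
    """Yield items one at a time, requesting the next page only when the
    current one has been consumed."""
    esk = None
    while True:
        response = fetch_page(exclusive_start_key=esk)
        for item in response["Items"]:
            yield item
        esk = response.get("LastEvaluatedKey")
        if esk is None:  # nothing left to fetch
            return


items = list(iter_items(fake_query_page))
print(len(items), "items fetched in", len(calls), "requests")
```

The caller's loop looks the same whether the result fits in one page or many, which is why the explicit start-key bookkeeping is only needed when you want to stop and resume between pages.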

garnaat answered Nov 04 '22 12:11