How do I paginate my results from DynamoDB using the Boto Python library? From the Boto API documentation, I can't figure out whether it supports pagination at all, although the DynamoDB API itself does.
DynamoDB paginates the results from Query operations. With pagination, the Query results are divided into "pages" of data that are 1 MB in size (or less). An application can process the first page of results, then the second page, and so on.
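As a rough sketch of what "processing page after page" looks like in code, the generator below follows the `LastEvaluatedKey` that DynamoDB returns with a truncated page and passes it back as `ExclusiveStartKey`. It assumes the newer boto3 low-level client (not the original Boto library the question is about); the function name `iter_query_items` is made up for illustration.

```python
def iter_query_items(client, **query_kwargs):
    """Yield every item of a Query, one page at a time.

    `client` is assumed to behave like boto3.client("dynamodb");
    `iter_query_items` itself is a hypothetical helper, not a boto3 API.
    """
    kwargs = dict(query_kwargs)
    while True:
        page = client.query(**kwargs)
        for item in page.get("Items", []):
            yield item
        # A truncated page carries LastEvaluatedKey; the final page does not.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        # Resume the next request right after the last item of this page.
        kwargs["ExclusiveStartKey"] = last_key


# Usage sketch (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("dynamodb")
# for item in iter_query_items(
#     client,
#     TableName="mytable",
#     KeyConditionExpression="#k = :v",
#     ExpressionAttributeNames={"#k": "id"},
#     ExpressionAttributeValues={":v": {"S": "foo"}},
# ):
#     print(item)
```

Because it is a generator, the next page is only requested once the caller has consumed the current one.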
Paginators are a feature of boto3 that act as an abstraction over the process of iterating over an entire result set of a truncated API operation.
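A minimal sketch of a paginator in use, assuming a boto3 DynamoDB client is passed in. `get_paginator` and `paginate` are boto3 calls; the wrapper name `scan_all_with_paginator` and the default page size are assumptions for illustration.

```python
def scan_all_with_paginator(client, table_name, page_size=100):
    """Collect every item of a table through boto3's Scan paginator.

    `client` is assumed to be boto3.client("dynamodb"); the helper name
    is hypothetical.
    """
    paginator = client.get_paginator("scan")
    items = []
    # The paginator issues the follow-up requests itself; we only see pages.
    pages = paginator.paginate(
        TableName=table_name,
        PaginationConfig={"PageSize": page_size},
    )
    for page in pages:
        items.extend(page.get("Items", []))
    return items


# Usage sketch (needs boto3 and AWS credentials):
# import boto3
# items = scan_all_with_paginator(boto3.client("dynamodb"), "mytable")
```

Collecting everything into a list is only reasonable for small tables; for large ones it is probably better to handle each page inside the loop.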
To get the item count of a DynamoDB table, open the AWS DynamoDB console and click on your table's name. In the Overview tab, scroll down to the Items summary section and click the Get live item count button.
DynamoDB supports two different types of read operations, which are query and scan. A query is a lookup based on either the primary key or an index key. A scan is, as the name indicates, a read call that scans the entire table in order to find a particular result.
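To make the difference concrete, here is a hedged sketch of the two calls side by side, again assuming a boto3 low-level client. The helper names, and the assumption that the key and attribute values are strings (the `"S"` type tag), are illustrative only.

```python
def query_by_hash_key(client, table_name, key_name, key_value):
    """Query: only items stored under one hash (partition) key value are read.

    Hypothetical helper; assumes a string-typed key.
    """
    return client.query(
        TableName=table_name,
        KeyConditionExpression="#k = :v",
        ExpressionAttributeNames={"#k": key_name},
        ExpressionAttributeValues={":v": {"S": key_value}},
    )


def scan_for_attribute(client, table_name, attr_name, attr_value):
    """Scan: every item in the table is read, then the filter is applied.

    Hypothetical helper; assumes a string-typed attribute.
    """
    return client.scan(
        TableName=table_name,
        FilterExpression="#a = :v",
        ExpressionAttributeNames={"#a": attr_name},
        ExpressionAttributeValues={":v": {"S": attr_value}},
    )
```

Both helpers return a single response page, so either one may still need the pagination handling discussed in this thread.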
Boto does have support for pagination-like behavior using a combination of "ExclusiveStartKey" and "Limit". For example, to paginate Scan.
Here is an example that should parse a whole table in chunks of 10:
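esk = None
while True:
    # load this batch of at most 10 items
    scan_generator = MyTable.scan(max_results=10, exclusive_start_key=esk)
    # do something useful with each item, counting how many came back
    count = 0
    for item in scan_generator:
        count += 1  # do something useful
    # are we done yet? a short batch, or no key to resume from, means the scan is finished
    last_key = scan_generator.kwargs['exclusive_start_key']
    if count < 10 or not last_key:
        break
    # load the last keys so the next batch starts where this one stopped
    esk = last_key.values()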
EDIT:
As pointed out by @garnaat, it is possible that I misunderstood your actual goal. The above suggestion lets you offer page-by-page navigation the way SO does for questions, for example, with no more than 15 per page.
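For that page-by-page use case, a hedged sketch with the newer boto3 client might look like the following. `fetch_page` is a made-up helper name; `Limit`, `ExclusiveStartKey` and `LastEvaluatedKey` are the DynamoDB request and response fields.

```python
def fetch_page(client, table_name, page_size=15, start_key=None):
    """Fetch one page of a Scan and the key needed to fetch the next one.

    Hypothetical helper; `client` is assumed to be boto3.client("dynamodb").
    Returns (items, next_key); next_key is None once the table is exhausted.
    """
    kwargs = {"TableName": table_name, "Limit": page_size}
    if start_key is not None:
        kwargs["ExclusiveStartKey"] = start_key
    response = client.scan(**kwargs)
    return response.get("Items", []), response.get("LastEvaluatedKey")


# Usage sketch (needs boto3 and AWS credentials):
# import boto3
# client = boto3.client("dynamodb")
# first_page, next_key = fetch_page(client, "mytable")
# if next_key is not None:
#     second_page, next_key = fetch_page(client, "mytable", start_key=next_key)
```

The returned key can be handed to the caller as an opaque "next page" token. Note that `Limit` caps the number of items evaluated, so if a filter expression were added, a page could come back with fewer than `page_size` items.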
If you just need a way to load the whole result set produced by a given Scan, Boto is a great library and already abstracts this for you, with no need for black magic like in my answer. In this case, you should follow what he (@garnaat) advises. Btw, he is the author of Boto and, as such, a great reference for Boto-related questions :)
Perhaps I'm misunderstanding the question but I think you are making it more difficult than it needs to be. If you are using the layer2 DynamoDB interface in boto (the default) it handles the pagination for you.
So, if you want to do a query operation, you simply do this:
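import boto

# connect and look up the table through the default (layer2) interface
c = boto.connect_dynamodb()
t = c.get_table('mytable')

# iterating over the result fetches further pages as needed
for item in t.query(hash_key='foo'):
    print(item)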
This will automatically handle the pagination of results from DynamoDB. The same would also work for a scan request.