Simple example of retrieving 500 items from dynamodb using Python

Tags: python, amazon-dynamodb

I'm looking for a simple example of retrieving 500 items from DynamoDB while minimizing the number of queries. I know there's a "multiget" function that would let me break this up into chunks of 50, but I'm not sure how to do this.

I'm starting with a list of 500 keys. I'm then thinking of writing a function that takes this list of keys, breaks it up into "chunks," retrieves the values, stitches them back together, and returns a dict of 500 key-value pairs.

Or is there a better way to do this?

As a corollary, how would I "sort" the items afterwards?

346

asked Aug 25 '12 12:08

ensnare

1 Answer

Depending on your schema, there are two ways to retrieve your 500 items efficiently.

1 Items are under the same `hash_key`, using a `range_key`

Use the query method with the hash_key.
You may ask for the results to be sorted by range_key, A-Z or Z-A.
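
A minimal sketch of this first case, assuming `table` is a boto layer2 Table whose `query` method accepts the `scan_index_forward` flag (the helper name `query_sorted` is hypothetical):

```python
def query_sorted(table, hash_key, descending=False):
    """Return all items stored under `hash_key`, ordered by range key.

    `table` is assumed to be a boto layer2 Table, or any object exposing a
    compatible `query(hash_key, scan_index_forward=...)` method.
    Passing scan_index_forward=False asks for Z-A order.
    """
    return list(table.query(hash_key, scan_index_forward=not descending))
```

The sort happens on the DynamoDB side here, so no extra work is needed in Python.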

2 Items are on "random" keys

You said it: use the BatchGetItem method.
Good news: the limit is actually 100 items per request (or 1 MB of data, whichever is reached first), not 50.
You will have to sort the results on the Python side.
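
As a rough illustration of the chunking and client-side sorting mentioned above, here is a plain-Python sketch. The helper name `chunked`, the `id` attribute, and the sample items are made up for the example and are not part of boto:

```python
def chunked(keys, size=100):
    """Yield successive lists of at most `size` keys.

    100 is the per-request item limit cited above.
    """
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


# Hypothetical input: 500 integer hash keys.
keys = list(range(500))
chunks = list(chunked(keys))

# Batch results come back in no particular order, so sort once everything
# has been merged. These items are made-up stand-ins for fetched results.
items = [{"id": 3, "value": "c"}, {"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
items_sorted = sorted(items, key=lambda item: item["id"])
```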

On the practical side, since you use Python, I highly recommend the Boto library for low-level access or the dynamodb-mapper library for higher-level access (Disclaimer: I am one of the core devs of dynamodb-mapper).

Sadly, neither of these libraries provides an easy way to wrap the batch_get operation. By contrast, there are generators for scan and for query which "pretend" you get everything in a single query.

To get optimal results with the batch query, I recommend this workflow:

Submit a batch with your keys (at most 100 per request, so 500 items need at least five requests).
Store the results in your dicts.
Re-submit with the UnprocessedKeys as many times as needed.
Sort the results on the Python side.
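
The workflow above could be sketched roughly as follows. This is not boto's API: `fetch_batch` is a hypothetical callable standing in for a single BatchGetItem request, and it is assumed to return the fetched items (as a dict) plus the list of keys the service left unprocessed.

```python
def batch_get_all(keys, fetch_batch, chunk_size=100):
    """Fetch every key, re-queuing unprocessed keys, and return one dict.

    `fetch_batch(chunk)` is a hypothetical wrapper around one BatchGetItem
    request. It is assumed to return `(items, unprocessed_keys)`, where
    `items` maps each fetched key to its value.
    """
    results = {}
    pending = list(keys)
    while pending:
        # Take at most `chunk_size` keys for this request.
        chunk, pending = pending[:chunk_size], pending[chunk_size:]
        items, unprocessed = fetch_batch(chunk)
        results.update(items)
        # Re-queue whatever the service did not process this time.
        pending.extend(unprocessed)
    return results


def sorted_items(results):
    """Return (key, value) pairs ordered by key, since batch results are unordered."""
    return sorted(results.items())
```

With boto, `fetch_batch` would wrap the `add_batch` and `submit` calls. This sketch loops until every key is processed, so a production version would probably cap the number of retries or pause between attempts.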

Quick example

I assume you have created a table "MyTable" with a single hash_key

import boto

# Helper function. This is more or less the code
# I added to the develop branch
def resubmit(batch, prev):
    # Empty (re-use) the batch
    del batch[:]

    # The batch answer contains the list of
    # unprocessed keys grouped by tables
    # (read them from `prev`, not from the global `res`)
    unprocessed = prev.get('UnprocessedKeys')
    if not unprocessed:
        # Nothing left to fetch: tell the caller to stop
        return None

    # Load the unprocessed keys
    for table_name, table_req in unprocessed.items():
        table_keys = table_req['Keys']
        table = batch.layer2.get_table(table_name)

        keys = []
        for key in table_keys:
            h = key['HashKeyElement']
            r = None
            if 'RangeKeyElement' in key:
                r = key['RangeKeyElement']
            keys.append((h, r))

        attributes_to_get = None
        if 'AttributesToGet' in table_req:
            attributes_to_get = table_req['AttributesToGet']

        batch.add_batch(table, keys, attributes_to_get=attributes_to_get)

    return batch.submit()

# Main
db = boto.connect_dynamodb()
table = db.get_table('MyTable')
batch = db.new_batch_list()

keys = list(range(100))  # Get items with hash keys 0 to 99

batch.add_batch(table, keys)

res = batch.submit()

while res:
    print(res)  # Do some useful work here
    res = resubmit(batch, res)

# The END

EDIT:

I've added a resubmit() function to BatchList in the Boto develop branch. It greatly simplifies the workflow:

Add all of your requested keys to the BatchList.
submit()
resubmit() as long as it does not return None.

This should be available in the next release.
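
A minimal sketch of that simplified loop, assuming `batch` is a BatchList that already holds the requested keys and provides the `resubmit()` method described above (the generator name `iter_batch_responses` is hypothetical):

```python
def iter_batch_responses(batch):
    """Yield each response from `batch`, resubmitting until nothing is left.

    `batch` is assumed to expose `submit()` and a `resubmit()` that returns
    None once there are no unprocessed keys.
    """
    response = batch.submit()
    while response:
        yield response
        response = batch.resubmit()
```

Each yielded response can then be merged into a dict and sorted on the Python side.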

131

answered Oct 10 '22 20:10

yadutaf
