I have a DynamoDB table with a primary partition key column, foo_id, and no primary sort key. I have a list of foo_id values and want to get the observations associated with this list of ids.
I figured the best way to do this is to use batch_get_item(), but it's not working out for me.
# python code
import boto3
client = boto3.client('dynamodb')
# ppk_values = list of `foo_id` values (strings) (< 100 in this example)
x = client.batch_get_item(
    RequestItems={
        'my_table_name':
            {'Keys': [{'foo_id': {'SS': [id for id in ppk_values]}}]}
    })
I'm using SS because I'm passing a list of strings (a list of foo_id values), but I'm getting:
ClientError: An error occurred (ValidationException) when calling the
BatchGetItem operation: The provided key element does not match the
schema
So I assume that means it thinks foo_id contains list values instead of string values, which is wrong.
Is that interpretation right? What's the best way to batch query for a bunch of primary partition key values?
Several items can share the same partition key only when the table has a sort key. If the table has a partition key and no sort key, each partition key value must be unique. If the table has a sort key, each combination of partition key and sort key must be unique.
The BatchGetItem operation returns the attributes of one or more items from one or more tables. You identify requested items by primary key. A single operation can retrieve up to 16 MB of data, which can contain as many as 100 items.
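For the low-level client used in the question, each requested item needs its own key dict holding a single typed value ('S' for a string), rather than one dict holding a string set ('SS'). Here is a minimal sketch of that, assuming the table and key names from the question. build_batch_requests is a hypothetical helper, not part of boto3; it also splits the ids into chunks to stay under the 100-item limit.

```python
from typing import Dict, Iterator, List


def build_batch_requests(
    table_name: str,
    key_name: str,
    values: List[str],
    batch_size: int = 100,  # BatchGetItem accepts at most 100 keys per call
) -> Iterator[Dict]:
    """Yield one RequestItems dict per chunk of at most batch_size keys.

    Hypothetical helper for illustration; not part of boto3.
    """
    # Drop duplicates while keeping order: a request that lists the same
    # key twice is rejected with a ValidationException.
    unique_values = list(dict.fromkeys(values))
    for start in range(0, len(unique_values), batch_size):
        chunk = unique_values[start:start + batch_size]
        yield {
            table_name: {
                # One key dict per item; 'S' marks a single string value.
                "Keys": [{key_name: {"S": value}} for value in chunk]
            }
        }


# Possible usage with the low-level client (needs boto3 and AWS credentials):
#   client = boto3.client("dynamodb")
#   for request_items in build_batch_requests("my_table_name", "foo_id", ppk_values):
#       response = client.batch_get_item(RequestItems=request_items)
#       items = response["Responses"]["my_table_name"]
```

The helper only builds the request payloads, so it can be checked without touching AWS. Items come back in the client's typed format (for example {'foo_id': {'S': 'abc'}}).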
Preventing overwrites of an existing attribute: if you want to avoid overwriting an existing attribute, you can use SET with the if_not_exists function. (The function name is case sensitive.) The if_not_exists function is specific to the SET action and can only be used in an update expression.
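As a rough illustration of if_not_exists, the sketch below builds the keyword arguments for a resource-level Table.update_item call. set_if_absent_kwargs is a hypothetical helper, and the key and attribute names in the usage comment are assumptions, not something from the original posts.

```python
from typing import Any, Dict


def set_if_absent_kwargs(key: Dict[str, Any], attribute: str, value: Any) -> Dict[str, Any]:
    """Build update_item arguments that set `attribute` only if it is missing.

    Hypothetical helper for illustration; not part of boto3.
    """
    return {
        "Key": key,
        # '#attr' and ':value' are placeholders resolved by the two maps below,
        # which also avoids clashes with DynamoDB reserved words.
        "UpdateExpression": "SET #attr = if_not_exists(#attr, :value)",
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {":value": value},
    }


# Possible usage (needs boto3 and AWS credentials):
#   table = boto3.resource("dynamodb").Table("my_table_name")
#   table.update_item(**set_if_absent_kwargs({"foo_id": "abc"}, "status", "new"))
```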
A table can have only one sort key, but its value can be composed from multiple columns.
Boto3's resource interface has a version of batch_get_item that lets you pass in the keys in a more natural Pythonic way, as plain Python values without specifying the types.
You can find a complete and working code example in https://github.com/awsdocs/aws-doc-sdk-examples. That example deals with some additional nuances around retries, but here's a digest of the parts of the code that answer this question:
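import logging
import boto3

dynamodb = boto3.resource('dynamodb')
logger = logging.getLogger(__name__)

movie_table = dynamodb.Table('Movies')
actor_table = dynamodb.Table('Actors')

# movie_list holds (year, title) tuples and actor_list holds actor names.
# Both are assumed to be defined by the caller.
batch_keys = {
    movie_table.name: {
        'Keys': [{'year': movie[0], 'title': movie[1]} for movie in movie_list]
    },
    actor_table.name: {
        'Keys': [{'name': actor} for actor in actor_list]
    }
}

# The resource interface takes plain Python values, so no 'S'/'N' type tags.
response = dynamodb.batch_get_item(RequestItems=batch_keys)
# Retrieved items are grouped by table name under the 'Responses' key.
for response_table, response_items in response['Responses'].items():
    logger.info("Got %s items from %s.", len(response_items), response_table)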
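The retry handling that the digest leaves out could look roughly like the sketch below. It is not the code from the linked repository; batch_get_with_retries and its parameters are hypothetical, and it assumes only that the response may carry leftover keys under 'UnprocessedKeys', which BatchGetItem returns when it could not process everything in one call.

```python
import time
from typing import Callable, Dict, List


def batch_get_with_retries(
    batch_get: Callable[..., Dict],
    request_items: Dict,
    max_tries: int = 5,
    base_delay: float = 1.0,
) -> Dict[str, List[Dict]]:
    """Call batch_get until nothing is left unprocessed or max_tries is hit.

    Hypothetical helper: batch_get is any callable with the shape of
    boto3's batch_get_item (it takes RequestItems and returns a dict).
    """
    retrieved: Dict[str, List[Dict]] = {name: [] for name in request_items}
    pending = request_items
    for attempt in range(max_tries):
        response = batch_get(RequestItems=pending)
        for table_name, items in response.get("Responses", {}).items():
            retrieved.setdefault(table_name, []).extend(items)
        # Keys that were not processed come back in the same shape as
        # RequestItems, so they can be sent again unchanged.
        pending = response.get("UnprocessedKeys") or {}
        if not pending or attempt == max_tries - 1:
            break
        # Back off exponentially before asking again for the leftover keys.
        time.sleep(base_delay * 2 ** attempt)
    return retrieved


# Possible usage (needs boto3 and AWS credentials):
#   dynamodb = boto3.resource("dynamodb")
#   items_by_table = batch_get_with_retries(dynamodb.batch_get_item, batch_keys)
```

Passing the call in as a parameter keeps the loop independent of boto3, so the same function works with either the client or the resource version of batch_get_item. If keys are still unprocessed after max_tries, this sketch silently returns what it has; real code would probably want to log or raise.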