Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the difference between BatchGetItem and Query in DynamoDB?

I've been going through AWS DynamoDB docs and, for the life of me, cannot figure out what's the core difference between batchGetItem() and Query(). Both retrieve items based on primary keys from tables and indexes. The only difference is in the size of the items retrieved but that doesn't seem like a ground breaking difference. Both also support conditional updates.

In what cases should I use batchGetItem over Query and vice-versa?

like image 824
suv Avatar asked Jun 10 '15 07:06

suv


People also ask

What is the difference between DynamoDB scan and Query?

DynamoDB scans and GSIs. DynamoDB supports two different types of read operations, which are query and scan. A query is a lookup based on either the primary key or an index key. A scan is, as the name indicates, a read call that scans the entire table in order to find a particular result.

What is the difference between GetItem and Query in DynamoDB?

getItem retrieve via hash and range key is a 1:1 fit, the time it takes (hence performance) to retrieve it is limited by the hash and sharding internally. Query results in a search on "all" range keys. It adds computational work, thus considered slower.

What is Query in DynamoDB?

In a Query operation, DynamoDB retrieves the items in sorted order, and then processes the items using KeyConditionExpression and any FilterExpression that might be present. Only then are the Query results sent back to the client. A Query operation always returns a result set.

What is BatchGetItem?

The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can transmit up to 16MB of data over the network, consisting of up to 25 item put or delete operations.


1 Answers

There’s an important distinction that is missing from the other answers:

  • Query requires a partition key
  • BatchGetItems requires a primary key

Query is only useful if the items you want to get happen to share a partition (hash) key, and you must provide this value. Furthermore, you have to provide the exact value; you can’t do any partial matching against the partition key. From there you can specify an additional (and potentially partial/conditional) value for the sort key to reduce the amount of data read, and further reduce the output with a FilterExpression. This is great, but it has the big limitation that you can’t get data that lives outside a single partition.

BatchGetItems is the flip side of this. You can get data across many partitions (and even across multiple tables), but you have to know the full and exact primary key: that is, both the partition (hash) key and any sort (range). It’s literally like calling GetItem multiple times in a single operation. You don’t have the partial-searching and filtering options of Query, but you’re not limited to a single partition either.

like image 196
Craig Walker Avatar answered Sep 22 '22 15:09

Craig Walker