Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between scan and query in dynamodb? When use scan / query?

A query operation as specified in DynamoDb documentation:

A query operation searches only primary key attribute values and supports a subset of comparison operators on key attribute values to refine the search process.

and the scan operation:

A scan operation scans the entire table. You can specify filters to apply to the results to refine the values returned to you, after the complete scan.

Which is best based on Performance and Cost Considerations.

like image 256
samson Avatar asked Apr 17 '17 13:04

samson


People also ask

What is Scan query?

A scan query is a simple search query used to retrieve data from a cache in a distributed manner. When executed without parameters, a scan query returns all entries from the cache.

What is the difference between getItem and query in DynamoDB?

getItem retrieve via hash and range key is a 1:1 fit, the time it takes (hence performance) to retrieve it is limited by the hash and sharding internally. Query results in a search on "all" range keys. It adds computational work, thus considered slower.

How does Scan work in DynamoDB?

A Scan operation in Amazon DynamoDB reads every item in a table or a secondary index. By default, a Scan operation returns all of the data attributes for every item in the table or index. You can use the ProjectionExpression parameter so that Scan only returns some of the attributes, rather than all of them.


1 Answers

When creating a Dynamodb table select Primary Keys and Local Secondary Indexes (LSIs) so that a Query operation returns the items you want.

Query operations only support an equal operator evaluation of the Primary Key, but conditional (=, <, <=, >, >=, Between, Begin) on the Sort Key.

Scan operations are generally slower and more expensive as the operation has to iterate through each item in your table to get the items you are requesting.

Example:

Table: CustomerId, AccountType, Country, LastPurchase  Primary Key: CustomerId + AccountType 

In this example, you can use a Query operation to get:

  1. A CustomerId with a conditional filter on AccountType

A Scan operation would need to be used to return:

  1. All Customers with a specific AccountType
  2. Items based on conditional filters by Country, ie All Customers from USA
  3. Items based on conditional filters by LastPurchase, ie All Customers that made a purchase in the last month

To avoid scan operations on frequently used operations create a Local Secondary Index (LSI) or Global Secondary Index (GSI).

Example:

Table: CustomerId, AccountType, Country, LastPurchase  Primary Key: CustomerId + AccountType GSI: AccountType + CustomerId LSI: CustomerId + LastPurchase 

In this example a Query operation can allow you to get:

  1. A CustomerId with a conditional filter on AccountType
  2. [GSI] A conditional filter on CustomerIds for a specific AccountType
  3. [LSI] A CustomerId with a conditional filter on LastPurchase
like image 176
Kinman Avatar answered Sep 17 '22 19:09

Kinman