 

AWS DynamoDB Scan and FilterExpression using array of hash values

I am having a hard time finding a useful example of a Scan with a FilterExpression on a DynamoDB table. I am using the JavaScript SDK in the browser.

I would like to scan my table and return only those records whose HASH field "UID" values are within an array I pass to the Scan.

Let's say I have an array of unique IDs that are values of my table's hash field. I would like to retrieve the matching records from my DynamoDB table.

Something like the code below:

var idsToSearch=['123','456','789'] //array of the HASH values I would like to retrieve
var tableToSearch = new AWS.DynamoDB();
var scanParams = {
  "TableName":"myAwsTable",  
  "AttributesToGet":['ID','COMMENTS','DATE'],  
  "FilterExpression":"'ID' in "+idsToSearch+"" 

}
tableToSearch.scan(scanParams, function(err,data){
    if (err) console.log(err, err.stack); //error handler
    else console.log(data); //success response
})
jotamon asked May 13 '15 15:05


People also ask

Can DynamoDB have multiple hash keys?

Using normal DynamoDB operations you're allowed to query either only one hash key per request (using GetItem or Query operations) or all hash keys at once (using the Scan operation).

Which is faster Scan or Query in DynamoDB?

More complex queries on DynamoDB data are occasionally required. Instead of scanning for such queries, it is usually preferable to create a GSI (global secondary index). Out of interest, I ran an experiment to confirm that Scan operation is indeed slower than Query operation.

What is difference between Scan and Query in DynamoDB?

DynamoDB supports two different types of read operations, which are query and scan. A query is a lookup based on either the primary key or an index key. A scan is, as the name indicates, a read call that scans the entire table in order to find a particular result.

How can I speed up DynamoDB Scan?

Scans are generally slow. To speed one up, you can use a feature called "Parallel Scan", which divides the whole DynamoDB table into segments. A separate thread or worker then processes each segment, so N workers can go through the whole keyspace simultaneously.
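As a rough sketch of what that looks like in request terms, each worker sends an ordinary Scan with two extra parameters, `Segment` and `TotalSegments`. The helper name `buildSegmentScanParams` and the table name are illustrative assumptions, and the snippet only builds the request objects without calling AWS.

```javascript
// Build one Scan request per segment for a parallel scan.
// buildSegmentScanParams is a hypothetical helper, not part of the AWS SDK.
function buildSegmentScanParams(tableName, totalSegments) {
  var requests = [];
  for (var segment = 0; segment < totalSegments; segment++) {
    requests.push({
      TableName: tableName,
      Segment: segment,            // zero-based index of this worker's segment
      TotalSegments: totalSegments // must be the same in every request
    });
  }
  return requests;
}

var segmentRequests = buildSegmentScanParams('myAwsTable', 4);
console.log(segmentRequests.length);
// With the SDK loaded, each worker would call
// new AWS.DynamoDB().scan(segmentRequests[i], callback)
```

Each worker still has to page through its own segment by feeding `LastEvaluatedKey` back in as `ExclusiveStartKey` until no key is returned.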


1 Answer

You should make use of the IN operator. It is also easier to use placeholders for attribute names and attribute values. I would, however, advise against using a Scan in this case: a Scan reads every item in the table and only applies the filter afterwards. It sounds like you already have the hash key attribute values that you want to find, so it would make more sense to use BatchGetItem.

Anyway, here is how you would write the Scan with IN in Java:
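A minimal sketch of the BatchGetItem route, assuming the low-level `AWS.DynamoDB` client from the question and a table keyed only on a string hash attribute named `ID` (no range key). The helper name `buildBatchGetParams` is made up for illustration, and the snippet only builds the request object.

```javascript
// Build BatchGetItem parameters for a list of hash key values.
// buildBatchGetParams is a hypothetical helper, not part of the AWS SDK.
// Assumes string-typed (S) hash keys and no range key.
function buildBatchGetParams(tableName, keyName, ids) {
  var requestItems = {};
  requestItems[tableName] = {
    Keys: ids.map(function (id) {
      var key = {};
      key[keyName] = { S: String(id) }; // low-level client needs typed values
      return key;
    })
  };
  return { RequestItems: requestItems };
}

var batchParams = buildBatchGetParams('myAwsTable', 'ID', ['123', '456', '789']);
console.log(JSON.stringify(batchParams.RequestItems.myAwsTable.Keys[0]));
// With the SDK loaded:
// new AWS.DynamoDB().batchGetItem(batchParams, callback);
```

A single BatchGetItem request accepts at most 100 keys, so a longer array would have to be split into chunks. Any keys the service could not read come back in `UnprocessedKeys` in the response and should be retried.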

ScanSpec scanSpec = new ScanSpec()
    .withFilterExpression("#idname in (:val1, :val2, :val3)")
    .withNameMap(ImmutableMap.of("#idname", "ID"))
    .withValueMap(ImmutableMap.of(":val1", "123", ":val2", "456", ":val3", "789"));
ItemCollection<ScanOutcome> items = table.scan(scanSpec);

I would imagine that with the JavaScript SDK it would be something like this:

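var scanParams = {
  "TableName": "myAwsTable",
  // AttributesToGet is a legacy parameter and cannot be combined with
  // expression parameters, so the projection is a ProjectionExpression
  "ProjectionExpression": "#idname, #comments, #date",
  "FilterExpression": "#idname in (:val1, :val2, :val3)",
  "ExpressionAttributeNames": {
    "#idname": "ID",
    "#comments": "COMMENTS",
    "#date": "DATE" // DATE is a DynamoDB reserved word, hence the placeholder
  },
  // the low-level AWS.DynamoDB client expects typed values ({ S: ... } for strings)
  "ExpressionAttributeValues": {
    ":val1": { "S": "123" },
    ":val2": { "S": "456" },
    ":val3": { "S": "789" }
  }
};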
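Since the question starts from an array rather than three fixed values, the placeholders can be generated. The following is only a sketch under a few assumptions: the low-level `AWS.DynamoDB` client, string-typed IDs, and a made-up helper name `buildInScanParams`. It builds the parameter object and does not call AWS.

```javascript
// Build Scan parameters whose FilterExpression matches any ID in `ids`.
// buildInScanParams is a hypothetical helper, not part of the AWS SDK.
function buildInScanParams(tableName, attributeName, ids) {
  // The IN operator is documented to accept up to 100 values.
  if (ids.length === 0 || ids.length > 100) {
    throw new Error('IN accepts between 1 and 100 values');
  }
  var values = {};
  var placeholders = ids.map(function (id, i) {
    var placeholder = ':val' + (i + 1);
    values[placeholder] = { S: String(id) }; // typed value for the low-level client
    return placeholder;
  });
  return {
    TableName: tableName,
    FilterExpression: '#idname IN (' + placeholders.join(', ') + ')',
    ExpressionAttributeNames: { '#idname': attributeName },
    ExpressionAttributeValues: values
  };
}

var generatedParams = buildInScanParams('myAwsTable', 'ID', ['123', '456', '789']);
console.log(generatedParams.FilterExpression);
// prints "#idname IN (:val1, :val2, :val3)"
// With the SDK loaded: new AWS.DynamoDB().scan(generatedParams, callback);
```

Passing the IDs as expression attribute values, instead of concatenating them into the expression string as the question attempted, avoids quoting problems and keeps the expression valid for any ID content.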
mkobit answered Sep 23 '22 04:09