Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DynamoDB batch execute QueryRequests

I have the following DynamoDB query which returns the first record with the hash apple and time-stamp less than some_timestamp.

Map<String, Condition> keyConditions = newHashMap();

keyConditions.put("HASH", new Condition().
    withComparisonOperator(EQ).
    withAttributeValueList(new AttributeValue().withS("apple")))
);

keyConditions.put("TIMESTAMP", new Condition().
    withComparisonOperator(LE).
    withAttributeValueList(new AttributeValue().withN(some_timestamp)))
);

QueryResult queryResult = dynamoDBClient.query(
    new QueryRequest().
            withTableName("TABLE").
            withKeyConditions(keyConditions).
            withLimit(1).
            withScanIndexForward(SCAN_INDEX_FORWARD)
);

I need to execute many queries of this kind and so my question: is it possible to batch execute these queries? Something like the following API.

Map<String, Condition> keyConditions = newHashMap();

keyConditions.put("HASH", new Condition().
    withComparisonOperator(EQ).
    withAttributeValueList(new AttributeValue().withS("apple")))
);

keyConditions.put("TIMESTAMP", new Condition().
    withComparisonOperator(LE).
    withAttributeValueList(new AttributeValue().withN(some_timestamp)))
);

QueryRequest one = new QueryRequest().
    withTableName("TABLE").
    withKeyConditions(keyConditions).
    withLimit(1).
    withScanIndexForward(SCAN_INDEX_FORWARD);

keyConditions = newHashMap();

keyConditions.put("HASH", new Condition().
    withComparisonOperator(EQ).
    withAttributeValueList(new AttributeValue().withS("pear")))
);

keyConditions.put("TIMESTAMP", new Condition().
    withComparisonOperator(LE).
    withAttributeValueList(new AttributeValue().withN(some_other_timestamp)))
);

QueryRequest two = new QueryRequest().
    withTableName("TABLE").
    withKeyConditions(keyConditions).
    withLimit(1).
    withScanIndexForward(SCAN_INDEX_FORWARD)

ArrayList<String> queryRequests = new ArrayList<String>() {{
    add(one);
    add(two);
}};

List<QueryResult> queryResults = dynamoDBClient.query(queryRequests);
like image 586
Cathal Coffey Avatar asked Jul 25 '14 10:07

Cathal Coffey


1 Answers

From a very similar question in the AWS forums here:

DynamoDB's Query API only supports a single "use" of the index in the query operation, and as a result, the "hash" of the index you're querying has to be specified as an EQ condition. DynamoDB does not currently have any kind of "batch query" API, so unfortunately what you're looking for is not possible today in a single API call. If these were GetItem requests (not suitable for your use case though), you could issue a BatchGetItem request.

In the meantime, since it looks like you're using Java, my recommendation would be to use threads to issue multiple query requests in parallel. Here's some sample code that accomplishes this, but you'll want to consider how you want your application to handle pagination / partial results, and errors:

/**
* Simulate a "Batch Query" operation in DynamoDB by querying an index for 
* multiple hash keys
* 
* Resulting list may be incomplete if any queries time out.  Returns a list of 
* QueryResult so that LastEvaluatedKeys can be followed.  A better implementation 
* would answer the case where some queries fail, deal with pagination (and 
* Limit), have configurable timeouts.  One improvement on this end would be 
* to make a simple immutable bean that contains a query result or exception, 
* as well as the associated request.  Maybe it could even be called back with 
* a previous list for pagination. 
* 
* @param hashKeyValues (you'll also need table name / index name) 
* @return a list of query results for the queries that succeeded
* @throws InterruptedException
*/
public List<QueryResult> queryAll(String... hashKeyValues) 
  throws InterruptedException {
  // initialize accordingly
  int timeout = 2 * 1000;
  ExecutorService executorService = Executors.newFixedThreadPool(10);

  final List<QueryResult> results = 
    new ArrayList<QueryResult>(hashKeyValues.length);
  final CountDownLatch latch = 
    new CountDownLatch(hashKeyValues.length);

  // Loop through the hash key values to "OR" in the final list of results
  for (final String hashKey : hashKeyValues) {

    executorService.submit(new Runnable() {

      @Override
      public void run() {
        try {
          // fill in parameters
          QueryResult result = dynamodb.query(new QueryRequest()
            .withTableName("MultiQueryExample")
            .addKeyConditionsEntry("City", new Condition()
              .withComparisonOperator("EQ")
            .withAttributeValueList(new AttributeValue(hashKey))));
          // one of many flavors of dealing with concurrency
          synchronized (results) { 
            results.add(result);
          }
        } catch (Throwable t) {
          // Log and handle errors
          t.printStackTrace();
        } finally {
          latch.countDown();
        }
      }
    });
  }

  // Wait for all queries to finish or time out
  latch.await(timeout, TimeUnit.MILLISECONDS);
 
  // return a copy to prevent concurrent modification of 
  // the list in the face of timeouts
  synchronized (results) {
    return new ArrayList<QueryResult>(results);
  }
}
like image 174
David Yanacek Avatar answered Jan 10 '23 04:01

David Yanacek