DynamoDBMapper load vs query

The DynamoDBMapper provides different ways to read one item from a table:

  • query
  • load

Is there a recommendation for which of them to use? In a quick test, the following two code snippets return the same "MyEntry" item for a table whose hash key is "hash" and whose range key is "date", and the query method is roughly 10% faster.

load

public MyEntry getEntryForDay(final Integer hash, final LocalDate date) {
    return mapper.load(MyEntry.class, hash, date);
}

query

public MyEntry getEntryForDay(final Integer hash, final LocalDate date) {
    final MyEntry hashKeyValues = new MyEntry ();
    hashKeyValues.setHash(hash);
    final Condition rangeKeyCondition = new Condition()//
            .withComparisonOperator(ComparisonOperator.EQ.toString())//
            .withAttributeValueList(new AttributeValue().withS(new LocalDateMarshaller().marshall(date)));
    final DynamoDBQueryExpression<MyEntry> queryExpression = new DynamoDBQueryExpression<MyEntry>()//
            .withHashKeyValues(hashKeyValues)//
            .withRangeKeyCondition("date", rangeKeyCondition)//
            .withLimit(1);
    final List<MyEntry> storedEntries = mapper
            .query(MyEntry.class, queryExpression);
    if (storedEntries.size() == 0) {
        return null;
    }
    return storedEntries.get(0);
}
asked Aug 06 '15 07:08 by Roland Ettinger



2 Answers

Load and Query are different operations:

If you have a hash-key-only schema, they perform the same operation: both retrieve the item with the specified hash key.

If you have a hash-range schema, load retrieves the one item identified by a single hash + range pair. Query retrieves all items that have the specified hash key and meet the range key condition.

Since you are using the equality operator for both the hash key and the range key, the two operations are equivalent here: both return the single item with that hash and range key.
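To make the distinction concrete, here is a minimal in-memory sketch of the two access patterns. It is a toy model and not the DynamoDB SDK: the class HashRangeTable, its methods, and the sample items are hypothetical names chosen for illustration only.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Toy in-memory stand-in for a hash-range table (hypothetical, not the SDK).
class HashRangeTable {
    // hash key -> (range key -> item); range keys are kept sorted per hash key.
    private final Map<Integer, TreeMap<LocalDate, String>> partitions = new HashMap<>();

    void put(int hash, LocalDate date, String item) {
        partitions.computeIfAbsent(hash, k -> new TreeMap<>()).put(date, item);
    }

    // load: at most one item, identified by the full hash + range key.
    String load(int hash, LocalDate date) {
        TreeMap<LocalDate, String> partition = partitions.get(hash);
        return partition == null ? null : partition.get(date);
    }

    // query: every item under one hash key whose range key meets the condition.
    List<String> query(int hash, Predicate<LocalDate> rangeCondition) {
        List<String> matches = new ArrayList<>();
        TreeMap<LocalDate, String> partition = partitions.get(hash);
        if (partition != null) {
            partition.forEach((date, item) -> {
                if (rangeCondition.test(date)) {
                    matches.add(item);
                }
            });
        }
        return matches;
    }

    public static void main(String[] args) {
        HashRangeTable table = new HashRangeTable();
        LocalDate day1 = LocalDate.of(2015, 8, 5);
        LocalDate day2 = LocalDate.of(2015, 8, 6);
        table.put(1, day1, "entry-a");
        table.put(1, day2, "entry-b");

        System.out.println(table.load(1, day2));                    // entry-b
        System.out.println(table.query(1, day2::equals));           // [entry-b]
        System.out.println(table.query(1, d -> !d.isBefore(day1))); // [entry-a, entry-b]
    }
}
```

With an equality condition on the range key, query yields a list holding the same single item that load returns. Any other range condition can match several items, which is the case load cannot express.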

answered Oct 11 '22 09:10 by Ben Schwartz


OK, now that I'm more used to working with DynamoDB, it turns out that a bug in my mapper.query code causes the poorer performance:

  • "withLimit(1)" does not limit the total number of results returned in the list. The results come back in a "PaginatedQueryList", and the actual items are lazily loaded from the database when they are accessed. withLimit(1) only limits the number of items loaded with each request.
  • The actual bug is the "if (storedEntries.size() == 0)" check, because the size() call loads ALL items in the list. Combined with withLimit(1), this gives the poorest possible performance, since each request then fetches a single item.
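The effect described in the two points above can be reproduced with a small self-contained model. This is a sketch under stated assumptions, not the SDK's PaginatedQueryList: the class LazyPagedList, its request counter, and the choice to fetch the first page in the constructor are all assumptions made for illustration.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of a lazily loaded, paginated result list (hypothetical, not the SDK class).
class LazyPagedList extends AbstractList<String> {
    private final List<String> backend;  // stands in for the matching items in the table
    private final int pageSize;          // plays the role of withLimit(n)
    private final List<String> loaded = new ArrayList<>();
    private int requests = 0;            // simulated round trips to the database

    LazyPagedList(List<String> backend, int pageSize) {
        this.backend = backend;
        this.pageSize = pageSize;
        fetchNextPage(); // assumption of this model: the first page is fetched up front
    }

    private boolean hasMorePages() {
        return loaded.size() < backend.size();
    }

    private void fetchNextPage() {
        requests++;
        int end = Math.min(loaded.size() + pageSize, backend.size());
        loaded.addAll(backend.subList(loaded.size(), end));
    }

    @Override
    public String get(int index) {
        // Load only as many pages as are needed to reach the index.
        while (index >= loaded.size() && hasMorePages()) {
            fetchNextPage();
        }
        return loaded.get(index);
    }

    @Override
    public int size() {
        // The total count is unknown until every page has been loaded.
        while (hasMorePages()) {
            fetchNextPage();
        }
        return loaded.size();
    }

    @Override
    public boolean isEmpty() {
        // Answerable from what is already loaded; triggers no further requests.
        return loaded.isEmpty() && !hasMorePages();
    }

    int requests() {
        return requests;
    }

    public static void main(String[] args) {
        LazyPagedList list = new LazyPagedList(Arrays.asList("a", "b", "c", "d", "e"), 1);
        list.isEmpty();
        list.get(0);
        System.out.println("isEmpty + get(0): " + list.requests() + " request(s)"); // 1
        list.size();
        System.out.println("after size(): " + list.requests() + " request(s)");     // 5
    }
}
```

In this model, a page size of 1 makes size() cost one request per item, while isEmpty() followed by get(0) stays at the single initial request.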

The correct code for the mapper query is:

public MyEntry getEntryForDay(final Integer hash, final LocalDate date) {
    final MyEntry hashKeyValues = new MyEntry ();
    hashKeyValues.setHash(hash);
    final Condition rangeKeyCondition = new Condition()//
            .withComparisonOperator(ComparisonOperator.EQ.toString())//
            .withAttributeValueList(new AttributeValue().withS(new LocalDateMarshaller().marshall(date)));
    final DynamoDBQueryExpression<MyEntry> queryExpression = new DynamoDBQueryExpression<MyEntry>()//
            .withHashKeyValues(hashKeyValues)//
            .withRangeKeyCondition("date", rangeKeyCondition)//
            .withLimit(1);
    final List<MyEntry> storedEntries = mapper
            .query(MyEntry.class, queryExpression);
    // isEmpty() does not force all pages to load, unlike size().
    if (storedEntries.isEmpty()) {
        return null;
    }
    return storedEntries.get(0);
}
answered Oct 11 '22 11:10 by Roland Ettinger