
Query size limits in DynamoDB

I don't get the concept of limits for Query/Scan in DynamoDB. According to the docs:

A single Query operation can retrieve a maximum of 1 MB of data. This limit applies before any FilterExpression is applied to the results.

Let's say I have 10k items, 250 KB per item, and all of them match the query params.

  1. If I run a simple query, do I get only 4 items?
  2. If I use ProjectionExpression to retrieve only a single attribute (1 KB in size), will I get 1k items?
  3. If I only need to count items (select: 'COUNT'), will it count all items (10k)?
Dmitry Oleinik asked Feb 16 '18 09:02


3 Answers

If I run a simple query, I get only 4 items?

Yes

If I use ProjectionExpression to retrieve only a single attribute (1 KB in size), will I get 1k items?

No. FilterExpression and ProjectionExpression are applied after the data has been read, and the 1 MB limit is measured against the full items that were read. So you still get 4 items.

If I only need to count items (select: 'COUNT'), will it count all items (10k)?

No, a single request still counts just 4.

The thing you are probably missing here is that you can still get all 10k results, or the 10k count; you just need to get the results in pages. Basically, when a query completes, check the LastEvaluatedKey attribute of the response. If it's not empty, run the query again with that value as ExclusiveStartKey to get the next set of results. Repeat this until the attribute is empty, and you know you have all the results.
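The paging loop described above can be sketched roughly as follows. This is only a minimal sketch, not an SDK helper: `queryAllPages` and the `runQuery` callback are hypothetical names. `runQuery` is assumed to wrap whatever client call you use (for example `params => dynamodb.query(params).promise()` with the v2 JavaScript SDK) and to resolve to a response that has `Items` and, when more data remains, `LastEvaluatedKey`.

```javascript
// Hypothetical helper: collects the items from every page of a Query.
// runQuery is an assumed callback that sends one Query request and
// resolves to its response ({ Items, LastEvaluatedKey, ... }).
async function queryAllPages(runQuery, params) {
  const items = [];
  let startKey; // undefined on the first request
  do {
    // Only add ExclusiveStartKey once a previous page has supplied one.
    const pageParams = startKey
      ? { ...params, ExclusiveStartKey: startKey }
      : params;
    const page = await runQuery(pageParams);
    items.push(...(page.Items || []));
    // Absent LastEvaluatedKey means this was the last page.
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}
```

Injecting the request function keeps the loop independent of a particular SDK version, and lets it be exercised with a fake in tests.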

EDIT: I should say that some of the SDKs abstract this away for you. For example, the Java SDK's DynamoDBMapper has query and queryPage: queryPage returns a single page, while query goes back to the server as many times as needed to get the full result set for you (i.e. in your case, it gives you the full 10k results).

F_SO_K answered Oct 24 '22 05:10


For any operation that returns items, you can request a subset of attributes to retrieve; however, doing so has no impact on the item size calculations. In addition, Query and Scan can return item counts instead of attribute values. Getting the count of items uses the same quantity of read capacity units and is subject to the same item size calculations. This is because DynamoDB has to read each item in order to increment the count.

Source: Managing Throughput Settings on Provisioned Tables (AWS documentation)
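To get a full count, the per-page counts have to be added up across pages. A rough sketch, under the same assumptions as any paged Query: `countAllPages` and `runQuery` are hypothetical names, and `runQuery` is assumed to send one Query request and resolve to a response with `Count` and an optional `LastEvaluatedKey`.

```javascript
// Hypothetical helper: sums Count over every page of a COUNT query.
// runQuery is an assumed callback that sends one Query request and
// resolves to its response ({ Count, LastEvaluatedKey, ... }).
async function countAllPages(runQuery, params) {
  let total = 0;
  let startKey; // undefined on the first request
  do {
    // Ask for counts only; no item attributes are returned.
    const pageParams = { ...params, Select: 'COUNT' };
    if (startKey) {
      pageParams.ExclusiveStartKey = startKey;
    }
    const page = await runQuery(pageParams);
    total += page.Count;
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return total;
}
```

With the numbers from the question (10k items of 250 KB each, so about 4 items per 1 MB page), this would take on the order of 2,500 requests, and it consumes the same read capacity as fetching the items themselves.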

PhongsakornP answered Oct 24 '22 05:10


Great explanation by @f-so-k.

This is how I am handling the query.

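import AWS from 'aws-sdk';

// query() lives on a DynamoDB client instance, not on the AWS namespace.
const dynamodb = new AWS.DynamoDB();

// Requests pages until one contains items or no pages are left,
// then returns that page (not the full result set).
async function loopQuery(params) {
  let keepGoing = true;
  let result = null;
  while (keepGoing) {
    let newParams = params;
    if (result && result.LastEvaluatedKey) {
      // Resume from where the previous page stopped.
      newParams = {
        ...params,
        ExclusiveStartKey: result.LastEvaluatedKey,
      };
    }
    result = await dynamodb.query(newParams).promise();
    // The response field is Count (capital C).
    if (result.Count > 0 || !result.LastEvaluatedKey) {
      keepGoing = false;
    }
  }
  return result;
}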


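// `user` (table name) and `name` (key value) are assumed to be defined elsewhere.
const params = {
  TableName: user,
  IndexName: 'userOrder',
  KeyConditionExpression: 'un=:n',
  ExpressionAttributeValues: {
    ':n': {
      S: name,
    },
  },
  ConsistentRead: false,
  ReturnConsumedCapacity: 'NONE',
  // Return every attribute projected into the index.
  Select: 'ALL_PROJECTED_ATTRIBUTES',
};

const result = await loopQuery(params);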

Edit:

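import AWS from 'aws-sdk';

// query() lives on a DynamoDB client instance, not on the AWS namespace.
const dynamodb = new AWS.DynamoDB();

// Follows LastEvaluatedKey until it is absent and returns the items from every page.
async function loopQuery(params) {
  let keepGoing = true;
  let result = null;
  let list = [];
  while (keepGoing) {
    let newParams = params;
    if (result && result.LastEvaluatedKey) {
      // Resume from where the previous page stopped.
      newParams = {
        ...params,
        ExclusiveStartKey: result.LastEvaluatedKey,
      };
    }
    result = await dynamodb.query(newParams).promise();
    // Collect this page's items; the response object itself is not iterable.
    list = [...list, ...result.Items];
    if (!result.LastEvaluatedKey) {
      keepGoing = false;
    }
  }
  return list;
}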


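// `user` (table name) and `name` (key value) are assumed to be defined elsewhere.
const params = {
  TableName: user,
  IndexName: 'userOrder',
  KeyConditionExpression: 'un=:n',
  ExpressionAttributeValues: {
    ':n': {
      S: name,
    },
  },
  ConsistentRead: false,
  ReturnConsumedCapacity: 'NONE',
  // Return every attribute projected into the index.
  Select: 'ALL_PROJECTED_ATTRIBUTES',
};

const result = await loopQuery(params);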

Jha Nitesh answered Oct 24 '22 05:10