
ScannedCount is less than the Total number of records and result is zero in dynamoDB

I am facing a weird issue with DynamoDB on AWS. I am querying my table through AWS API Gateway (AWS Service Proxy) and I get Count: 0, while ScannedCount is approximately 2500 out of a total of 10000 records. To confirm: the data I am looking for does exist in the table, and I am using the Scan operation.

What I am not able to understand is why ScannedCount is less than the total number of records in the table. Is this supposed to happen?

Abdeali Chandanwala asked Oct 14 '17 13:10



1 Answer

According to the DynamoDB documentation, ScannedCount is the number of items DynamoDB has looked through for the current request, and Count is the number of items that matched your filter:

Counting the Items in the Results

In addition to the items that match your criteria, the Query response contains the following elements:

  • ScannedCount: the number of items that matched the key condition expression, before a filter expression (if present) was applied.
  • Count: the number of items that remain, after a filter expression (if present) was applied.

Note

If you do not use a filter expression, then ScannedCount and Count will have the same value.

If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results (see Paginating the Results).

Each Query response will contain the ScannedCount and Count for the items that were processed by that particular Query request. To obtain grand totals for all of the Query requests, you could keep a running tally of both ScannedCount and Count.
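The running tally described above can be sketched as follows. This is a minimal illustration rather than an official API: `tally_counts` is a hypothetical helper name, and it assumes each page is the dictionary returned by one Scan or Query request, with the `Count` and `ScannedCount` keys.

```python
from typing import Iterable, Mapping, Tuple


def tally_counts(pages: Iterable[Mapping]) -> Tuple[int, int]:
    """Sum Count and ScannedCount across a sequence of response pages.

    Hypothetical helper: `pages` is assumed to hold the dictionaries
    returned by successive Scan (or Query) requests.
    """
    total_count = 0
    total_scanned = 0
    for page in pages:
        # Each page reports only the items processed by that one request.
        total_count += page.get("Count", 0)
        total_scanned += page.get("ScannedCount", 0)
    return total_count, total_scanned
```

With four pages of 2500 scanned items each, the second element of the result would be 10000, even if every individual response reported only 2500.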

So in your case, the first Scan request stopped after examining the first 2500 records, most likely because it reached the 1 MB limit (ScannedCount is 2500), and none of those records matched your filter (Count is zero).

To scan the rest of the data in the table, you need to repeat the request with pagination parameters as described here:

A single Scan will only return a result set that fits within the 1 MB size limit. To determine whether there are more results, and to retrieve them one page at a time, applications should do the following:

  1. Examine the low-level Scan result:
     • If the result contains a LastEvaluatedKey element, proceed to step 2.
     • If there is not a LastEvaluatedKey in the result, then there are no more items to be retrieved.
  2. Construct a new Scan request with the same parameters as the previous one, but this time take the LastEvaluatedKey value from step 1 and use it as the ExclusiveStartKey parameter in the new Scan request.
  3. Run the new Scan request.
  4. Go to step 1.
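The steps above can be sketched as a small loop. This is only an illustration under stated assumptions: `scan_all_pages` is a hypothetical helper name, and `client` is assumed to be any object with a `scan(**kwargs)` method that returns a dictionary shaped like a low-level Scan response, such as a boto3 DynamoDB client.

```python
from typing import Any, Dict, Iterator


def scan_all_pages(client: Any, **scan_params: Any) -> Iterator[Dict[str, Any]]:
    """Yield every page of a Scan, following LastEvaluatedKey until it is absent.

    Hypothetical helper: `client` is assumed to expose scan(**kwargs),
    as a boto3 DynamoDB client does.
    """
    params = dict(scan_params)  # copy so the caller's arguments are not mutated
    while True:
        page = client.scan(**params)
        yield page
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no LastEvaluatedKey: there are no more items to retrieve
        # Same parameters as before, resuming after the last item examined.
        params["ExclusiveStartKey"] = last_key
```

With boto3 this might be called as `scan_all_pages(boto3.client("dynamodb"), TableName="my-table")`, where the table name is a placeholder. Summing `ScannedCount` over the yielded pages should then approach the full size of the table, and any matching items appear in the `Items` list of whichever page contains them.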

Depending on the language, you may find a library that handles the pagination for you, such as the boto2 high-level DynamoDB client for Python or a "paginator" in boto3.
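For the boto3 case, a paginator-based version might look like the sketch below. `scan_with_paginator` is a hypothetical helper name; the code assumes `client` is a boto3 DynamoDB client (or any object offering the same `get_paginator("scan")` interface), and it collects all matching items in memory, which may not suit very large result sets.

```python
from typing import Any, Dict, List


def scan_with_paginator(client: Any, **scan_params: Any) -> Dict[str, Any]:
    """Run a full Scan through a paginator and merge the pages.

    Hypothetical helper: the paginator passes ExclusiveStartKey between
    requests, so no manual LastEvaluatedKey handling is needed here.
    """
    paginator = client.get_paginator("scan")
    items: List[Any] = []
    scanned = 0
    for page in paginator.paginate(**scan_params):
        items.extend(page.get("Items", []))
        scanned += page.get("ScannedCount", 0)
    # Count is derived from the merged items; ScannedCount is the running total.
    return {"Items": items, "Count": len(items), "ScannedCount": scanned}
```

The keyword arguments are the same ones a single Scan request would take, for example `TableName` and `FilterExpression`.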

Boris Serebrov answered Oct 05 '22 14:10