I am facing a weird issue with AWS DynamoDB. I am querying my table through AWS API Gateway (AWS Service Proxy) using the Scan operation, and I get Count: 0 with a ScannedCount of approximately 2,500, out of a total of 10,000 records. To confirm, the table does contain the data I am scanning for.
What I cannot understand is why the ScannedCount is less than the total number of records in the table. Is this supposed to happen?
To get the item count of a DynamoDB table: open the AWS DynamoDB console and click on your table's name. In the Overview tab, scroll down to the Items summary section and click the Get live item count button.
You can use up to 3,000 Read Capacity Units (RCUs) and up to 1,000 Write Capacity Units (WCUs) on a single partition per second. Note — this is a lot of capacity! This would allow you to read 12MB of strongly-consistent data or 24MB of eventually-consistent data per second, as well as to write 1MB of data per second.
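The figures above follow from the unit definitions: one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. A minimal sketch of that arithmetic, assuming 1 MB is treated as 1,000 KB to get round numbers; the function name `partition_throughput_mb` is made up for illustration:

```python
RCU_STRONG_KB = 4  # one RCU: one strongly consistent read/sec, item up to 4 KB
WCU_KB = 1         # one WCU: one write/sec, item up to 1 KB


def partition_throughput_mb(rcus=3000, wcus=1000):
    """Return (strong read MB/s, eventual read MB/s, write MB/s) for one partition.

    Assumes 1 MB = 1000 KB, which is an approximation for readability.
    """
    strong_read_mb = rcus * RCU_STRONG_KB / 1000
    # Eventually consistent reads cost half as much, so throughput doubles.
    eventual_read_mb = strong_read_mb * 2
    write_mb = wcus * WCU_KB / 1000
    return strong_read_mb, eventual_read_mb, write_mb


print(partition_throughput_mb())  # (12.0, 24.0, 1.0)
```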
A Query operation can retrieve a maximum of 1 MB of data. This limit applies before the filter expression is evaluated.
DynamoDB Scans and Queries are limited to processing 1 MB of data per operation. The number of records returned depends on the size of each individual record. Since items in DynamoDB are schemaless and can vary in size from one another, you can very easily run up against the 1 MB limit in a Scan or Query.
AFAIK, after trying to figure this out for months, you can't get a count of all matching records in a single request. DynamoDB goes through each record in the table or index and returns those matching the filter, one page of up to 1 MB of scanned data at a time. You may only have 20 matching records and would get 20 as the count.
It "returns the number of matching items, rather than the matching items themselves". Important: as brought up by Saumitra R. Bhave in a comment, "If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items."
According to the DynamoDB documentation, ScannedCount is the number of items DynamoDB has looked through for the current request, and Count is the number of items that matched your filter (see "Counting the Items in the Results"):
In addition to the items that match your criteria, the Query response contains the following elements:
- ScannedCount: the number of items that matched the key condition expression, before a filter expression (if present) was applied.
- Count: the number of items that remain, after a filter expression (if present) was applied.
Note
If you do not use a filter expression, then ScannedCount and Count will have the same value.
If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results (see Paginating the Results).
Each Query response will contain the ScannedCount and Count for the items that were processed by that particular Query request. To obtain grand totals for all of the Query requests, you could keep a running tally of both ScannedCount and Count.
So in your case, the scan went through the first 2,500 records (ScannedCount is 2500) and found no results matching your filter (Count is zero).
To scan the rest of the data in the table, you need to repeat the request with pagination parameters, as the documentation describes:
A single Scan will only return a result set that fits within the 1 MB size limit. To determine whether there are more results, and to retrieve them one page at a time, applications should do the following:
1. Examine the low-level Scan result:
   - If the result contains a LastEvaluatedKey element, proceed to step 2.
   - If there is not a LastEvaluatedKey in the result, then there are no more items to be retrieved.
2. Construct a new Scan request, with the same parameters as the previous one, but this time take the LastEvaluatedKey value from step 1 and use it as the ExclusiveStartKey parameter in the new Scan request.
3. Run the new Scan request.
4. Go to step 1.
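The steps above can be sketched as a loop that also keeps the running tally of Count and ScannedCount mentioned earlier. This is only a minimal sketch, assuming a boto3-style low-level client whose `scan` method takes keyword arguments and returns a dict with `Count`, `ScannedCount`, and optionally `LastEvaluatedKey`. The function name `scan_totals` and the table name in the usage comment are placeholders, not anything from the question.

```python
def scan_totals(client, **scan_kwargs):
    """Repeat Scan until no LastEvaluatedKey is returned.

    Returns (count, scanned_count) summed over all pages.
    `client` is assumed to expose scan(**kwargs) like boto3's
    low-level DynamoDB client.
    """
    total_count = 0
    total_scanned = 0
    kwargs = dict(scan_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        page = client.scan(**kwargs)
        total_count += page.get("Count", 0)
        total_scanned += page.get("ScannedCount", 0)
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages to read
        # Same parameters as before, resuming where the last page stopped.
        kwargs["ExclusiveStartKey"] = last_key
    return total_count, total_scanned


# Usage (assumes boto3 is installed and credentials are configured;
# "MyTable" is a placeholder):
#   import boto3
#   client = boto3.client("dynamodb")
#   count, scanned = scan_totals(client, TableName="MyTable", Select="COUNT")
```

With a 10,000-item table that returns about 2,500 items per page, a loop like this would issue roughly four Scan requests before the totals cover the whole table.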
Depending on the language, you can find a library that does the pagination for you, such as the boto2 high-level DynamoDB client for Python or the "paginator" in boto3.
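As a rough illustration of the boto3 paginator route, the sketch below sums Count and ScannedCount across pages. It assumes a boto3 low-level client (which provides `get_paginator("scan")` and `paginate(...)`); the function name `scan_totals_with_paginator` and the table name in the usage comment are placeholders.

```python
def scan_totals_with_paginator(client, **scan_kwargs):
    """Sum Count and ScannedCount over every page of a Scan.

    `client` is assumed to be a boto3 low-level DynamoDB client; the
    paginator handles LastEvaluatedKey / ExclusiveStartKey internally.
    """
    paginator = client.get_paginator("scan")
    total_count = 0
    total_scanned = 0
    for page in paginator.paginate(**scan_kwargs):
        total_count += page["Count"]
        total_scanned += page["ScannedCount"]
    return total_count, total_scanned


# Usage (assumes boto3 is installed and credentials are configured;
# "MyTable" is a placeholder):
#   import boto3
#   client = boto3.client("dynamodb")
#   print(scan_totals_with_paginator(client, TableName="MyTable"))
```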