When we scan a DynamoDB table, we can/should use LastEvaluatedKey to track the progress so that we can resume in case of failures. The documentation says that 
LastEvaluateKey is The primary key of the item where the operation stopped, inclusive of the previous result set. Use this value to start a new operation, excluding this value in the new request.
My question is if I start a scan, pause, insert a few rows and resume the scan from the previous LastEvaluatedKey, will I get those new rows after resuming the scan?
My guess is I might miss some of all of the new rows because the new keys will be hashed and the values could be smaller than LastEvaluatedKey.
Is my guess right? Any explanation or documentation links are appreciated.
It is going sequentially through your data, and it does not know about all items that were added in the process:
Scan operations proceed sequentially; however, for faster performance on a large table or secondary index, applications can request a parallel Scan operation by providing the Segment and TotalSegments parameters.
Not only it can miss some of the items that were added after you've started scanning it can also miss some of the items that were added before the scan started if you are using eventually consistent read:
Scan uses eventually consistent reads when accessing the data in a table; therefore, the result set might not include the changes to data in the table immediately before the operation began.
If you need to keep track of items that were added after you've started a scan you can use DynamoDB streams for that.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With