I've had a read through AWS's docs around pagination:
As their docs specify:
In a response, DynamoDB returns all the matching results within the scope of the Limit value. For example, if you issue a Query or a Scan request with a Limit value of 6 and without a filter expression, DynamoDB returns the first six items in the table that match the specified key conditions in the request (or just the first six items in the case of a Scan with no filter)
Which means that given I have a table called Questions
with an attribute called difficulty
(that can take any numeric value ranging from 0
to 2
) I might end up with the following conundrum:
GET /questions?difficulty=0&limit=3
3
to the DynamoDB query, which might return 0
items as the first 3 in the collection might not be of difficulty == 0
questions
that match that criteria without knowing I might return duplicatesHow can I then paginate based on a query correctly? Something where I'll get as many results as I asked for whilst having the correct offset
DynamoDB paginates the results from Query operations. With pagination, the Query results are divided into "pages" of data that are 1 MB in size (or less). An application can process the first page of results, then the second page, and so on.
Pagination in NodeJS is defined as adding the numbers to identify the sequential number of the pages. In pagination, we used to skip and limit for reducing the size of data in the database when they are very large in numbers.
ExclusiveStartKey. The primary key of the first item that this operation will evaluate. Use the value that was returned for LastEvaluatedKey in the previous operation. The data type for ExclusiveStartKey must be String, Number, or Binary. No set data types are allowed.
When using a command, by default the AWS CLI automatically makes multiple calls to return all possible results to create pagination. One call for each page. Disabling pagination has the AWS CLI only call once for the first page of command results.
Using async/await.
const getAllData = async (params) => {
console.log("Querying Table");
let data = await docClient.query(params).promise();
if(data['Items'].length > 0) {
allData = [...allData, ...data['Items']];
}
if (data.LastEvaluatedKey) {
params.ExclusiveStartKey = data.LastEvaluatedKey;
return await getAllData(params);
} else {
return data;
}
}
I am using a global variable allData to collect all the data.
Calling this function is enclosed within a try-catch
try {
await getAllData(params);
console.log("Processing Completed");
// console.log(allData);
} catch(error) {
console.log(error);
}
I am using this from within a Lambda and it works fine.
The article here really helped and guided me. Thanks.
Here is an example of how to iterate over a paginated result set from
a DynamoDB scan
(can be easily adapted for query
as well) in Node.js.
You could save the LastEvaluatedKey
state serverside and pass an identifier back to your client, which it would send with its next request and your server would pass that value as ExclusiveStartKey
in the next request to DynamoDB.
const AWS = require('aws-sdk');
AWS.config.logger = console;
const dynamodb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
let val = 'some value';
let params = {
TableName: "MyTable",
ExpressionAttributeValues: {
':val': {
S: val,
},
},
Limit: 1000,
FilterExpression: 'MyAttribute = :val',
// ExclusiveStartKey: thisUsersScans[someRequestParamScanID]
};
dynamodb.scan(scanParams, function scanUntilDone(err, data) {
if (err) {
console.log(err, err.stack);
} else {
// do something with data
if (data.LastEvaluatedKey) {
params.ExclusiveStartKey = data.LastEvaluatedKey;
dynamodb.scan(params, scanUntilDone);
} else {
// all results scanned. done!
someCallback();
}
}
});
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With