In a MongoDB aggregation pipeline, do records flow from stage to stage one at a time (or in batches), or does each stage wait until it has processed the whole collection before passing results to the next stage?
For example, I have a collection classtest with the following sample records:
{name: "Person1", marks: 20}
{name: "Person2", marks: 20}
{name: "Person1", marks: 20}
I have a total of 1000 records for about 100 students, and I have the following aggregate query:
db.classtest.aggregate([
  {$sort: {name: 1}},
  {$group: {_id: '$name', total: {$sum: '$marks'}}},
  {$limit: 5}
])
I have the following questions.
My actual goal is to paginate the results of the aggregate. In the above scenario, if $group maintains sort order and processes only the required number of records, I want to apply the $match condition {$gte: 'lastPersonName'} in subsequent page queries.
I have solved the problem without needing to maintain another collection, and without $group traversing the whole collection, hence I am posting my own answer.
As others have pointed out:
$group doesn't retain order, hence early sorting is not of much help.
$group doesn't do any optimization even if there is a following $limit, i.e., it runs $group on the entire collection.
My use case has the following unique features, which helped me to solve it:
I am not very particular about page size. The front end is capable of handling varying page sizes.
The following is the aggregation command I have used.
db.classtest.aggregate([
  {$sort: {name: 1}},
  {$limit: 5 * 10},
  {$group: {_id: '$name', total: {$sum: '$marks'}}},
  {$sort: {_id: 1}}
])
Explaining the above:
When $sort immediately precedes $limit, the framework optimizes the amount of data to be sent to the next stage. Refer to the aggregation pipeline optimization section of the MongoDB documentation.
At most 5 * 10 = 50 records are passed to the $group stage. With this, the size of the final result may be anywhere between 0 and 50.
Then the name in the last record (among the retained results) is used as the $match criterion in the subsequent page request, as shown below.
db.classtest.aggregate([
  // lastRecordName: name taken from the last retained record of the previous page
  {$match: {name: {$gt: lastRecordName}}},
  {$sort: {name: 1}},
  {$limit: 5 * 10},
  {$group: {_id: '$name', total: {$sum: '$marks'}}},
  {$sort: {_id: 1}}
])
In the above, the framework will still optimize $match, $sort and $limit together as a single operation, which I have confirmed through the explain plan.
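The two pipelines above differ only in the leading $match, so they can be produced by one small helper. This is a minimal sketch, not part of MongoDB: the name buildPagePipeline and its parameters pageSize and fetchFactor are hypothetical, and the defaults simply mirror the 5 * 10 used above.

```javascript
// Hypothetical helper: builds the pipeline shown above for one page.
// lastRecordName: name from the last retained result of the previous page,
//                 or null/undefined when requesting the first page.
function buildPagePipeline(lastRecordName, pageSize = 5, fetchFactor = 10) {
  const pipeline = [];
  if (lastRecordName != null) {
    // Keyset condition: only names strictly after the previous page.
    pipeline.push({ $match: { name: { $gt: lastRecordName } } });
  }
  pipeline.push(
    { $sort: { name: 1 } },
    { $limit: pageSize * fetchFactor },
    { $group: { _id: '$name', total: { $sum: '$marks' } } },
    { $sort: { _id: 1 } } // $group does not retain order, so sort again
  );
  return pipeline;
}

// Usage (assumes a mongo shell session with the classtest collection):
// db.classtest.aggregate(buildPagePipeline(null));       // first page
// db.classtest.aggregate(buildPagePipeline('Person5'));  // a later page
```

Keeping $match, $sort and $limit at the front of the array preserves the stage order that the explain plan was checked against.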
Pagination on grouped data in MongoDB:
You can't apply pagination directly to the items inside a $group, but the trick below can be used if you want pagination on grouped data.
For example, I want to group products by category and then return only 5 products per category.
Step 1 - write an aggregation on the products collection and group by category:
{ $group: { _id: '$prdCategoryId', products: { $push: '$$ROOT' } } },
Step 2 - slice the grouped array, using prdSkip for skipping and prdLimit for limiting; pass both in dynamically:
{
  $project: {
    // pagination for products
    products: {
      $slice: ['$products', prdSkip, prdLimit],
    }
  }
},
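For a non-negative prdSkip, the three-argument $slice returns up to prdLimit elements starting at index prdSkip. The plain JavaScript below is only an analogy to show which elements come back for a given page; sliceProducts is a hypothetical name and nothing here runs on the server.

```javascript
// Client-side analogy of $slice: ['$products', prdSkip, prdLimit]
// for a non-negative prdSkip (hypothetical helper, illustration only).
function sliceProducts(products, prdSkip, prdLimit) {
  return products.slice(prdSkip, prdSkip + prdLimit);
}

const sampleProducts = ['p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7'];
console.log(sliceProducts(sampleProducts, 0, 5)); // first page: p1 to p5
console.log(sliceProducts(sampleProducts, 5, 5)); // second page: p6 and p7
```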
Finally, the query looks like this. The parameters limit and skip paginate the categories, and prdSkip and prdLimit paginate the products:
db.products.aggregate([
  { $group: { _id: '$prdCategoryId', products: { $push: '$$ROOT' } } },
  {
    $lookup: {
      from: 'categories',
      localField: '_id',
      foreignField: '_id',
      as: 'categoryProducts',
    },
  },
  {
    $replaceRoot: {
      newRoot: {
        $mergeObjects: [{ $arrayElemAt: ['$categoryProducts', 0] }, '$$ROOT'],
      },
    },
  },
  {
    $project: {
      // pagination for products
      products: {
        $slice: ['$products', prdSkip, prdLimit],
      },
      _id: 1,
      catName: 1,
      catDescription: 1,
    },
  },
])
  // pagination for category: skip before limit, because where these chained
  // calls append $skip/$limit stages (as in Mongoose) they run in this order
  .skip(skip)
  .limit(limit);
I used $replaceRoot here to pull the category fields up to the top level of each result.
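Because $group does not order its output, category pages are only stable across requests if a sort runs before the skip and limit. A minimal sketch, assuming the category paging is written as explicit pipeline stages instead of chained calls; categoryPagingStages is a hypothetical helper name, and sorting by _id is an assumption (any stable key would do).

```javascript
// Hypothetical helper: trailing stages for paginating the grouped categories.
// The order matters: sort, then skip, then limit.
function categoryPagingStages(skip, limit) {
  return [
    { $sort: { _id: 1 } }, // assumed sort key; gives pages a stable order
    { $skip: skip },
    { $limit: limit },
  ];
}

// Usage (assumes `stages` holds the $group/$lookup/$replaceRoot/$project
// stages from the query above):
// db.products.aggregate([...stages, ...categoryPagingStages(skip, limit)]);
```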