I want to scan a whole MongoDB collection and compute a custom
aggregation. I am using Node with Mongoose. To scan the whole collection I was using MyModel.find({}, callback);
When I run the code, I found that Mongoose executes the query, gathers all the matching documents into an array, and then simply passes that whole array to the callback. On a full collection scan this takes a huge amount of time.
Isn't it possible to get a cursor object that I can iterate over, handling each document with a callback as it arrives, instead of waiting for the whole lot to be gathered into an array? (This is what I observed; please correct me if I am wrong.)
Also, can someone please advise whether a full collection scan is the right approach for custom aggregations, or whether I should look into map-reduce
or some alternative like that?
Your first option should be to use the aggregate method instead of find to do whatever aggregation you're looking to do. If that doesn't cover what you need, look into mapReduce, as you mentioned.
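For example, many custom aggregations reduce to a $match plus a $group stage, which lets the server do the work instead of shipping every document to Node. The field names here (status, category, amount) are hypothetical placeholders, since the question doesn't say what is being aggregated:

```javascript
// Hypothetical pipeline: total the "amount" field per "category",
// filtering first so less data flows through the pipeline.
var pipeline = [
  { $match: { status: 'active' } },
  { $group: { _id: '$category', total: { $sum: '$amount' } } }
];

// MyModel.aggregate(pipeline, function (err, results) {
//   // results is an array like [{ _id: 'books', total: 42 }, ...]
// });
```

Putting the $match stage first matters: it can use indexes and shrinks the data set before the $group stage runs.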
However, if you find that you do need to iterate over a large collection, you should use Mongoose's support for streaming the result of the query rather than getting it in one big array.
var stream = MyModel.find().stream();
stream.on('data', function (doc) {
  // do something with the mongoose document
}).on('error', function (err) {
  // handle the error
}).on('close', function () {
  // the stream is closed
});
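Note that in later Mongoose versions (4.5+), Query#stream() was deprecated in favor of Query#cursor(), which yields documents one at a time and can be consumed with for await. The sketch below uses a stand-in async generator instead of a real cursor, so the iteration pattern is clear without a database connection; fakeCursor, sumField, and the amount field are illustrative assumptions, not part of the Mongoose API:

```javascript
// With a real connection you would write:
//   var cursor = MyModel.find().cursor();
//   for await (const doc of cursor) { /* one document at a time */ }

// Stand-in cursor: an async generator that yields documents one by one,
// mimicking how a QueryCursor streams results instead of buffering them.
async function* fakeCursor(docs) {
  for (const doc of docs) {
    yield doc;
  }
}

// Example consumer: accumulate a running total while iterating,
// so memory use stays constant regardless of collection size.
async function sumField(cursor) {
  let total = 0;
  for await (const doc of cursor) {
    total += doc.amount;
  }
  return total;
}
```

The key property is the same as with the stream: only one document is held in memory at a time, so the scan's memory footprint no longer grows with the collection.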