Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mongodb node.js $out with aggregation only working if calling toArray()

Saving an aggregation query using "mongodb": "^3.0.6" as result with the $out operator is only working when calling .toArray().

The aggregation step(s):

let aggregationSteps = [{
    $group: {
        _id: '$created_at',
    }
}, {'$out': 'ProjectsByCreated'}];

Executing the aggregation:

await collection.aggregate(aggregationSteps, {'allowDiskUse': true})

Expected result: New collection called ProjectsByCreated.

Result: No collection, query does not throw an exception but is not being executed? (takes only 1ms)

Appending toArray() results in the expected behaviour:

await collection.aggregate(aggregationSteps, {'allowDiskUse': true}).toArray();

Why does mongodb only create the result collection when calling .toArray() and where does the documentation tell so? How can I fix this?

The documentation doesn't seem to provide any information about this:

  • https://mongodb.github.io/node-mongodb-native/3.0/api/Collection.html#aggregate
  • https://docs.mongodb.com/manual/reference/operator/aggregation/out/index.html
like image 676
Matthias Herrmann Avatar asked Apr 14 '18 19:04

Matthias Herrmann


1 Answers

MongoDB acknowledge this behaviour, but they also say this is working as designed.

It has been logged as a bug in the MongoDB JIRA, $out aggregation stage doesn't take effect, and the responses say it is not a fault:

This behavior is intentional and has not changed in some time with the node driver. When you "run" an aggregation by calling Collection.prototype.aggregate, we create an intermediary Cursor which is not executed until some sort of I/O is requested. This allows us to provide the chainable cursor API (e.g. cursor.limit(..).sort(..).project(..)), building up the find or aggregate options in a builder before executing the initial query.

... Chaining toArray in order to execute the out stage doesn't feel quite right. Is there something more natural that I haven't noticed?

Unfortunately not, the chained out method there simply continues to build your aggregation. Any of the following methods will cause the initial aggregation to be run: toArray, each, forEach, hasNext, next. We had considered adding something like exec/run for something like change streams, however it's still in our backlog. For now you could theoretically just call hasNext which should run the first aggregation and retrieve the first batch (this is likely what exec/run would do internally anyway).

So, it looks like you do have to call one of the methods to start iterating the cursor before $out will do anything. Adding .toArray(), as you're already doing, is probably safest. Note that to.Array() does not load the entire result into RAM as normal; because it includes a $out, the aggregation returns an empty cursor.

like image 106
Vince Bowdren Avatar answered Nov 15 '22 03:11

Vince Bowdren