
mongodb $aggregate empty array and multiple documents

My MongoDB collection contains the following documents:

> db.test.find({name:{$in:["abc","abc2"]}})
{ "_id" : 1, "name" : "abc", "scores" : [ ] }
{ "_id" : 2, "name" : "abc2", "scores" : [ 10, 20 ] }

I want to get the length of the scores array for each document. How should I do this?

I tried the following command:

db.test.aggregate({$match:{name:"abc2"}}, {$unwind: "$scores"}, {$group: {_id:null, count:{$sum:1}}} )

Result:

{ "_id" : null, "count" : 2 }

But the following command:

db.test.aggregate({$match:{name:"abc"}}, {$unwind: "$scores"}, {$group: {_id:null, count:{$sum:1}}} )

returns nothing. Questions:

  1. How can I get the length of scores for two or more documents in one command?
  2. Why does the second command return nothing, and how should I check whether the array is empty?
James Yang asked Mar 20 '15 09:03



2 Answers

This is actually a common problem. When the array is empty, the $unwind stage of an aggregation pipeline removes the document from the pipeline results, so the following $group stage has nothing to count.

To return a count of 0 for such an empty array, you need to do something like the following.

In MongoDB 2.6 or greater, just use $size:

db.test.aggregate([
    { "$match": { "name": "abc" } },
    { "$group": {
       "_id": null,
       "count": { "$sum": { "$size": "$scores" } }
    }}
])

In earlier versions you need to do this:

db.test.aggregate([
    { "$match": { "name": "abc" } },
    { "$project": {
        "name": 1,
        "scores": {
            "$cond": [
                { "$eq": [ "$scores", [] ] },
                { "$const": [false] },
                "$scores"
            ]
        }
    }},
    { "$unwind": "$scores" },
    { "$group": {
        "_id": null,
        "count": { "$sum": {
            "$cond": [
                "$scores",
                1,
                0
            ]
        }}
    }}
])

The modern operation is simple, since $size just measures the array. In the pre-2.6 approach you need to replace the empty array with an array holding a single false value, so that $unwind still emits one document for it instead of dropping it.

Replacing with false then allows the $cond ternary to choose whether to add 1 or 0 to the $sum. Note that $cond treats any falsy element the same way, so a real score of 0 would also be counted as 0 by this older approach.

That is how you get the length of empty arrays.
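To make the $unwind behaviour concrete, here is a minimal sketch in plain JavaScript that emulates $unwind followed by a counting $group on the two sample documents. It is not MongoDB code and not how the server implements these stages; the helpers unwind() and countAll() and the collection array are hypothetical names introduced only for illustration.

```javascript
// Plain JavaScript emulation of { $unwind: "$scores" } followed by
// { $group: { _id: null, count: { $sum: 1 } } }.
// unwind() and countAll() are hypothetical helpers, not MongoDB APIs.

function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    // One output document per array element. An empty array yields
    // zero iterations, so its document never reaches the output.
    for (const value of doc[field]) {
      out.push({ ...doc, [field]: value });
    }
  }
  return out;
}

function countAll(docs) {
  // With no input documents no group is created, so the result is empty
  // rather than a single document with count 0.
  return docs.length === 0 ? [] : [{ _id: null, count: docs.length }];
}

// The two documents from the question.
const collection = [
  { _id: 1, name: "abc", scores: [] },
  { _id: 2, name: "abc2", scores: [10, 20] },
];

const abc = countAll(unwind(collection.filter((d) => d.name === "abc"), "scores"));
const abc2 = countAll(unwind(collection.filter((d) => d.name === "abc2"), "scores"));

console.log(abc);  // []
console.log(abc2); // [ { _id: null, count: 2 } ]
```

The empty result for "abc" mirrors the "returns nothing" seen in the question: the document is gone before the counting step ever runs.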

Neil Lunn answered Sep 28 '22 10:09


To get the length of scores for two or more documents, change the _id value in the $group stage, which holds the group-by key. In this case, group by the document's own _id so that each document gets its own count.

Your second aggregation returns nothing because the $match stage passes a document whose scores array is empty, and $unwind then drops that document. To let only documents with a non-empty array through, use either of these $match conditions:

{'scores.0': {$exists: true}}
{scores: {$not: {$size: 0}}}

Overall, your aggregation should look like this:

db.test.aggregate([
    { "$match": {"scores.0": { "$exists": true } } },
    { "$unwind": "$scores" },
    {
        "$group": {
           "_id": "$_id",
           "count": { "$sum": 1 }
        }
    }
])
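The pipeline above leaves out documents whose array is empty. If a count of 0 should appear for them instead, one possible sketch skips $unwind and uses the $size aggregation operator (MongoDB 2.6 or later) in a $project stage. It assumes every matched document has scores stored as an array, since $size raises an error otherwise; the variable name pipeline is only illustrative.

```javascript
// Per-document array length without $unwind.
// Assumes MongoDB 2.6+ and that "scores" is always an array.
const pipeline = [
  { $match: { name: { $in: ["abc", "abc2"] } } },
  { $project: { name: 1, count: { $size: "$scores" } } },
];

// `db` and `printjson` exist only inside the mongo shell; the guard
// lets this file load without error in other JavaScript runtimes.
if (typeof db !== "undefined") {
  db.test.aggregate(pipeline).forEach(printjson);
}
```

Run against the two sample documents, this should report a count of 0 for "abc" and 2 for "abc2", one output document per input document.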
chridam answered Sep 28 '22 08:09