Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count subdocuments that match a given criteria

My Mongo collection has document of the form:

{
    ...
    "notifications": [
        {
            "enabled": true,
            "groups": [ "NG1", "NG3" ]
        },
        {
            "enabled": false,
            "groups": []
        }
    ]
}

where enabled is boolean and groups is a list of strings. I need to perform a query to identify how many entries in notifications have enabled = true and contain a given string in groups (e.g. NG3).

Previously, without the enabled property which was introduced later as requirement, my query was simply

db.collection.find({ "notifications.groups": "NG3" })

I tried a few combinations with $and operator but with no luck, so any suggestion is welcome. Thanks in advance!

like image 556
ilPittiz Avatar asked Dec 11 '22 16:12

ilPittiz


2 Answers

Would recommend running an aggregation framework pipeline that uses a combination of the $filter and $size array operators within a $project pipeline step.

The $filter operator will return an array with elements in a subset of the array that match a specified condition. The $size will simply return the number of elements in that filtered array.

So, putting this altogether, you have this pipeline to run so that you can identify how many entries in notifications have enabled = true and contain a given string in groups (e.g. "NG3"):

var pipeline = [
    { "$match": { "notifications.enabled": true, "notifications.groups": "NG3" } },
    {
        "$project": {
            "numberOfEntries": {
                "$size": {
                    "$filter": {
                        "input": "$notifications",
                        "as": "items",
                        "cond": { 
                            "$and": [
                                { "$eq": [ "$$items.enabled", true ] },
                                { "$setIsSubset": [ [ "NG3" ], "$$items.groups" ] }
                            ] 
                        }
                    }
                }
            }
        }
    }
];

db.collection.aggregate(pipeline);

The above works for MongoDB versions 3.2.X and newer. However, for a solution that covers MongoDB versions 2.6.X up to and including 3.0.X, other array operators like $map, $setDifference would be good substitute operators for filtering arrays.

Consider using the $map operator to filter the array using the same logic as above in the $cond as your mapping expression. The $setDifference operator then returns a set with elements that appear in the first set but not in the second set; i.e. performs a relative compliment of the second set relative to the first. In this case it will return the final notifications array that has elements not related to the parent documents via the enabled and groups properties.

var pipeline = [   
    { "$match": { "notifications.enabled": true, "notifications.groups": "NG3" } },
    {
        "$project": {
            "numberOfEntries": {
                "$size": {
                    "$setDifference": [
                        {
                            "$map": {
                                "input": "$notifications",
                                "as": "items",
                                "in": {
                                    "$cond": [
                                        { "$and": [
                                            { "$eq": [ "$$items.enabled", true ] },
                                            { "$setIsSubset": [ [ "NG3" ], "$$items.groups" ] }
                                        ] },
                                        "$$items",
                                        false
                                    ]
                                }
                            }
                        },
                        [false]
                    ]
                }
            }
        }
    }
];

db.collection.aggregate(pipeline);

For older MongoDB versions which do not have the above operators, consider using the $match, $unwind and $group operators to achieve the same objective:

var pipeline = [
    { "$match": { "notifications.enabled": true, "notifications.groups": "NG3" } },
    { "$unwind": "$notifications" },
    { "$match": { "notifications.enabled": true, "notifications.groups": "NG3" } },
    {
        "$group": {
            "_id": "$_id",
            "numberOfEntries": { "$sum": 1 }
        }
    }
];
db.collection.aggregate(pipeline);
like image 125
chridam Avatar answered Jan 08 '23 10:01

chridam


Count the documents

Use $elemMatch:

db.collection.find({
    "notifications": {
         "$elemMatch": {
               "enabled": true,
               "groups": "NG3"
         }
     }
})

The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.

The problem with the following query:

db.collection.find({ "notifications.groups": "NG3", "notifications.enabled": true })

Is that the field references aren't restricted to a single notification. So, this query will match as long as one of the notifications is enabled as true and one contains an NG3 in its groups, but what you want is for both attributes to apply to the same notification. In order to restrict the matching process, you should use the $elemMatch operator.

Count notifications individually

If you want to count the number of notifications with that criteria, you should use an Aggregation Pipeline, as @chridam explained in depth in his answer. I recommend the following four stages to count the notifications:

  1. Throw everything away and just keep the notifications
  2. Expand the notifications array, generating one notification document for each array element
  3. Keep those notifications with {enabled: true, groups: "NG3"}

  4. Count the remaining notifications

Then define these four variables corresponding to each stage:

var keepNotification = {$project: {
                                      "notifications.enabled": 1, 
                                      "notifications.groups": 1, 
                                      _id: 0
                                  }
                       }

var expandNotifications = {$unwind: "$notifications"}

var filterByEnabledAndGroups = {$match: {
                                            "notifications.enabled": true,  
                                            "notifications.groups": "NG3"
                                        }
                               }

var count = {$group: {_id: "notifications", count: {$sum: 1}}}

And use them in your aggregation pipeline:

db.collection.aggregate([
                            keepNotification, 
                            expandNotifications, 
                            filterByEnabledAndGroups, 
                            count
                       ])

The final result would be something like:

{ "_id" : "notifications", "count" : 5 }
like image 34
Ali Dehghani Avatar answered Jan 08 '23 10:01

Ali Dehghani