I have a document structure like this:
[{
    name: "Something",
    codes: [
        {type: 11},
        {type: 11},
        {type: 15}
    ]
},
{
    name: "Another",
    codes: [
        {type: 11},
        {type: 12},
        {type: 15},
        {type: 11}
    ]
}]
I need to count how many times type = 11 appears per entry in the collection.
I am stumped.
Although a $match stage could filter the collection down to the documents containing codes of a particular type, it should not be used for this problem, since it would drop the documents that lack that type code from the output.
You need to:
1. Unwind each document on the codes field.
2. Project a field wantedType with value 1 when the code's type is the wanted type, or else with value 0.
3. Group by the _id field and sum the wantedType field, which gives you the number of codes of the wanted type in each document.
This way no document is left out of the output, and a document without the wanted type gets 0 as its count.
Code:
var typeCountToCalculate = 11;
db.collection.aggregate([
    {$unwind: "$codes"},
    {$project: {
        "name": 1,
        "wantedType": {$cond: [{$eq: ["$codes.type", typeCountToCalculate]}, 1, 0]}
    }},
    {$group: {
        "_id": "$_id",
        "name": {$first: "$name"},
        "count": {$sum: "$wantedType"}
    }}
])
Output:
{
    "_id" : ObjectId("54ad79dae024832588b287f4"),
    "name" : "Another",
    "count" : 2
}
{
    "_id" : ObjectId("54ad79dae024832588b287f3"),
    "name" : "Something",
    "count" : 2
}
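To see what each of the three stages contributes, here is a minimal sketch of the same unwind, project and group steps in plain JavaScript over an in-memory array. It is only an illustration of the logic, not how the server executes the pipeline. The helper name countTypePerDocument, the numeric _id values and the third document "Neither" are assumptions added for the example.

```javascript
// Sample documents from the question. The _id values are made up for
// illustration, and "Neither" is a hypothetical document without type 11.
const docs = [
  { _id: 1, name: "Something", codes: [{ type: 11 }, { type: 11 }, { type: 15 }] },
  { _id: 2, name: "Another", codes: [{ type: 11 }, { type: 12 }, { type: 15 }, { type: 11 }] },
  { _id: 3, name: "Neither", codes: [{ type: 12 }] }
];

// Hypothetical helper that mimics $unwind -> $project -> $group in memory.
function countTypePerDocument(documents, wantedType) {
  // $unwind: one row per array element, carrying the parent fields along.
  const unwound = documents.flatMap(doc =>
    (doc.codes || []).map(code => ({ _id: doc._id, name: doc.name, codes: code }))
  );

  // $project: flag each row with 1 if it has the wanted type, else 0.
  const projected = unwound.map(row => ({
    _id: row._id,
    name: row.name,
    wantedType: row.codes.type === wantedType ? 1 : 0
  }));

  // $group: sum the flags per original document _id.
  const groups = new Map();
  for (const row of projected) {
    const group = groups.get(row._id) || { _id: row._id, name: row.name, count: 0 };
    group.count += row.wantedType;
    groups.set(row._id, group);
  }
  return [...groups.values()];
}

const perDocument = countTypePerDocument(docs, 11);
console.log(perDocument);
```

Because nothing is filtered out before grouping, the "Neither" document still appears in the result, with a count of 0.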
The aggregation framework of MongoDB is the answer here. The key operations are $unwind, which turns the array contents into "de-normalized" documents, and the $group pipeline stage, which obtains the counts.
The $match pipeline stages are an optimization. The first one, at the beginning of the pipeline, filters out documents that cannot possibly match. The second one, after the $unwind stage, removes those elements (now documents) that do not match the condition:
db.collection.aggregate([
    // Match to filter documents
    { "$match": { "codes.type": 11 }},
    // Unwind to 'de-normalize'
    { "$unwind": "$codes" },
    // Match to filter again, but remove the array elements
    { "$match": { "codes.type": 11 }},
    // Count the occurrences of the matches
    { "$group": {
        "_id": "$codes.type",
        "count": { "$sum": 1 }
    }}
])
Naturally, if you took out all of the $match stages, you would get the count per type over the whole collection.
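As a rough illustration of both cases, the sketch below walks the same match, unwind, match and group steps over an in-memory copy of the question's documents. The helper name countByType is an assumption made for this example. Leaving out the wanted type corresponds to dropping both $match stages.

```javascript
// Sample documents from the question.
const sampleDocs = [
  { name: "Something", codes: [{ type: 11 }, { type: 11 }, { type: 15 }] },
  { name: "Another", codes: [{ type: 11 }, { type: 12 }, { type: 15 }, { type: 11 }] }
];

// Hypothetical helper mimicking $match -> $unwind -> $match -> $group.
// When wantedType is omitted, both filtering steps are skipped.
function countByType(documents, wantedType) {
  const filtering = wantedType !== undefined;

  // First $match: keep documents with at least one matching element.
  const matchedDocs = filtering
    ? documents.filter(doc => doc.codes.some(code => code.type === wantedType))
    : documents;

  // $unwind: flatten all array elements into one list.
  const elements = matchedDocs.flatMap(doc => doc.codes);

  // Second $match: drop the elements that do not match.
  const matchedElements = filtering
    ? elements.filter(code => code.type === wantedType)
    : elements;

  // $group: count occurrences per type.
  const counts = new Map();
  for (const code of matchedElements) {
    counts.set(code.type, (counts.get(code.type) || 0) + 1);
  }
  return [...counts].map(([type, count]) => ({ _id: type, count: count }));
}

const onlyEleven = countByType(sampleDocs, 11);
const allTypes = countByType(sampleDocs);
console.log(onlyEleven); // a single group for type 11 with count 4
console.log(allTypes);   // one group per type: 11 -> 4, 15 -> 2, 12 -> 1
```

Note that this counts over the whole collection, not per document, because the grouping key is the type rather than the document _id.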
As of MongoDB 2.6 you can alter this a bit with the $redact operator, although it is a little contrived because of the recursive nature of this pipeline stage:
db.collection.aggregate([
    // Match to filter documents
    { "$match": { "codes.type": 11 }},
    // Filter out non matches
    { "$redact": {
        "$cond": {
            "if": { "$eq": [
                { "$ifNull": [ "$type", 11 ] },
                11
            ]},
            "then": "$$DESCEND",
            "else": "$$PRUNE"
        }
    }},
    // Unwind to 'de-normalize'
    { "$unwind": "$codes" },
    // Count the occurrences of the matches
    { "$group": {
        "_id": "$codes.type",
        "count": { "$sum": 1 }
    }}
])
It's a different way of filtering, and perfectly valid if your server supports it. Just be careful when using it in other examples with nested levels.
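The caution about nested levels is easier to see with a rough in-memory model of the recursion. The sketch below is an approximation of how the $cond is applied at each document level, written in plain JavaScript. The names redactLevel and isPlainObject are hypothetical, and the code is not the server's implementation.

```javascript
// True for embedded documents, false for arrays, null and scalars.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Hypothetical model of the $redact stage: returns the kept level,
// or null when the level is pruned.
function redactLevel(node, wantedType) {
  // $ifNull: a level without a "type" field is treated as a match,
  // otherwise the top-level document itself would be pruned.
  const effectiveType =
    node.type === undefined || node.type === null ? wantedType : node.type;

  // $$PRUNE: drop this level and everything below it.
  if (effectiveType !== wantedType) return null;

  // $$DESCEND: keep this level and re-apply the test to embedded documents.
  const kept = {};
  for (const [key, value] of Object.entries(node)) {
    if (Array.isArray(value)) {
      kept[key] = value.flatMap(item => {
        if (!isPlainObject(item)) return [item];
        const child = redactLevel(item, wantedType);
        return child === null ? [] : [child];
      });
    } else if (isPlainObject(value)) {
      const child = redactLevel(value, wantedType);
      if (child !== null) kept[key] = child;
    } else {
      kept[key] = value;
    }
  }
  return kept;
}

const another = {
  name: "Another",
  codes: [{ type: 11 }, { type: 12 }, { type: 15 }, { type: 11 }]
};
const redacted = redactLevel(another, 11);
console.log(JSON.stringify(redacted));

// A hypothetical document that has its own top-level "type" field:
// the same condition now prunes the whole document.
const tagged = { type: 5, name: "Tagged", codes: [{ type: 11 }] };
const pruned = redactLevel(tagged, 11);
console.log(pruned);
```

The second call shows the pitfall: the condition is evaluated at every level, so any level that happens to carry a "type" field with another value is removed together with its matching children.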
Always filter for the matching values you want to work on first, before doing any other operations. This removes unnecessary work from the aggregation pipeline. There is no use processing 100,000 documents if only 10,000 are possible matches and only 2,000 elements within those documents actually match.