Is it possible to merge array fields while using the MongoDB aggregation framework? Here is a summary of the problem I am trying to solve:
Sample input documents for aggregation:
{
"Category" : 1,
"Messages" : ["Msg1", "Msg2"],
"Value" : 1
},
{
"Category" : 1,
"Messages" : [],
"Value" : 10
},
{
"Category" : 1,
"Messages" : ["Msg1", "Msg3"],
"Value" : 100
},
{
"Category" : 2,
"Messages" : ["Msg4"],
"Value" : 1000
},
{
"Category" : 2,
"Messages" : ["Msg5"],
"Value" : 10000
},
{
"Category" : 3,
"Messages" : [],
"Value" : 100000
}
We want to group by 'Category' while summing up 'Value' and merging 'Messages'. I have tried this aggregation pipeline:
{$group : {
_id : "$Category",
Value : { $sum : "$Value"},
Messages : {$push : "$Messages"}
}
},
{$unwind : "$Messages"},
{$unwind : "$Messages"},
{$group : {
_id : "$_id",
Value : {$first : "$Value"},
Messages : {$addToSet : "$Messages"}
}
}
The result is:
"result" : [{
"_id" : 1,
"Value" : 111,
"Messages" : ["Msg3", "Msg2", "Msg1"]
},
{
"_id" : 2,
"Value" : 11000,
"Messages" : ["Msg5", "Msg4"]
}
]
However, this completely misses Category 3 since the documents where 'Category' is 3 do not have any 'Messages' and they are dropped by the second unwind. We would like the result to include the following as well:
{
"_id" : 3,
"Value" : 100000,
"Messages" : []
}
Is there a neat way of achieving this by the aggregation framework?
Here is a trick you can use if Messages is guaranteed to be an array:
> db.messages.find()
{ "Category" : 1, "Messages" : [ "Msg1", "Msg2" ], "Value" : 1 }
{ "Category" : 1, "Messages" : [ ], "Value" : 10 }
{ "Category" : 1, "Messages" : [ "Msg1", "Msg3" ], "Value" : 100 }
{ "Category" : 2, "Messages" : [ "Msg4" ], "Value" : 1000 }
{ "Category" : 2, "Messages" : [ "Msg5" ], "Value" : 10000 }
{ "Category" : 3, "Messages" : [ ], "Value" : 100000 }
> var group1 = {
"$group": {
"_id": "$Category",
"Value": {
"$sum": "$Value"
},
"Messages": {
"$push": "$Messages"
}
}
};
> var project1 = {
"$project": {
"Value": 1,
"Messages": {
"$cond": [
{
"$eq": [
"$Messages",
[ [ ] ]
]
},
[ [ null ] ],
"$Messages"
]
}
}
};
> db.messages.aggregate([ group1, project1 ])
{ "_id" : 3, "Value" : 100000, "Messages" : [ [ null ] ] }
{ "_id" : 2, "Value" : 11000, "Messages" : [ [ "Msg4" ], [ "Msg5" ] ] }
{ "_id" : 1, "Value" : 111, "Messages" : [ [ "Msg1", "Msg2" ], [ ], [ "Msg1", "Msg3" ] ] }
Now unwind twice and re-group to get a single Messages array.
> var unwind = {"$unwind":"$Messages"};
> var group2 = {
$group: {
"_id": "$_id",
"Value": {
"$first": "$Value"
},
"Messages": {
"$addToSet": "$Messages"
}
}
};
> var project2 = {
"$project": {
"Category": "$_id",
"_id": 0,
"Value": 1,
"Messages": {
"$cond": [
{
"$eq": [
"$Messages",
[ null ]
]
},
[ ],
"$Messages"
]
}
}
};
> db.messages.aggregate([ group1, project1, unwind, unwind, group2, project2 ])
{ "Value" : 111, "Messages" : [ "Msg3", "Msg2", "Msg1" ], "Category" : 1 }
{ "Value" : 11000, "Messages" : [ "Msg5", "Msg4" ], "Category" : 2 }
{ "Value" : 100000, "Messages" : [ ], "Category" : 3 }
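As already mentioned in one of the comments, the simplest answer to the original question is to set the preserveNullAndEmptyArrays option to true on the $unwind stage (available since MongoDB 3.2), so that documents whose Messages array is empty are kept instead of being dropped.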
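A minimal sketch of that suggestion, assuming MongoDB 3.2 or later and the same messages collection as above. It is the pipeline from the question with $unwind written in its document form; the variable name pipeline is only illustrative. Category 3 should then come through with an empty Messages array, because $unwind omits the field when the array is empty and $addToSet skips missing values.

```javascript
// Sketch only: the pipeline from the question, with preserveNullAndEmptyArrays added.
// `pipeline` is an illustrative name, not something from the original post.
var pipeline = [
  // Sum Value per Category and collect each document's Messages array.
  { $group: {
      _id: "$Category",
      Value: { $sum: "$Value" },
      Messages: { $push: "$Messages" }
  } },
  // First unwind: one document per original Messages array.
  { $unwind: { path: "$Messages", preserveNullAndEmptyArrays: true } },
  // Second unwind: one document per message; empty arrays are kept, not dropped.
  { $unwind: { path: "$Messages", preserveNullAndEmptyArrays: true } },
  // Re-group by category, de-duplicating the messages.
  { $group: {
      _id: "$_id",
      Value: { $first: "$Value" },
      Messages: { $addToSet: "$Messages" }
  } }
];

// In the mongo shell, against the collection used above:
// db.messages.aggregate(pipeline)
```

Strictly speaking only the second $unwind needs the option, since $push always yields at least one element per group; setting it on both stages is harmless and keeps them symmetric.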