I have some items
collection like this:
[
{ name: 'item1', description: 'description #1', categories: 'cat_A; cat_B'},
{ name: 'item2', description: 'description #2', categories: 'cat_B'},
{ name: 'item3', description: 'description #3', categories: 'cat_C; cat_B'},
{ name: 'item4', description: 'description #4', categories: 'cat_B; cat_A'},
{ name: 'item5', description: 'description #5', categories: 'cat_B'},
{ name: 'item6', description: 'description #6', categories: 'cat_D'}
]
I want to find and filter results by categories. I created mongo query:
db.getCollection('items')
.aggregate([
{
$match: {
categories: {$in: [/cat_a/i, /cat_b/i]}
}
}, {
$group: {
_id: "$categories",
items: { $push: { name: "$name", description: '$description' } }
}
}
])
So it returns me this:
result : [
{
"_id" : "cat_B; cat_C",
"items" : [
{
"name" : "item3",
"description" : "description #3"
}
]
}, {
"_id" : "cat_B; cat_A",
"items" : [
{
"name" : "item4",
"description" : "description #4"
}
]
}, {
"_id" : "cat_B",
"items" : [
{
"name" : "item2",
"description" : "description #2"
},
{
"name" : "item5",
"description" : "description #5"
}
]
}, {
"_id" : "cat_A; cat_B",
"items" : [
{
"name" : "item1",
"description" : "description #1"
}
]
}
]
What I want to achieve is:
result : [
{
"_id" : "cat_A",
"items" : [
{
"name" : "item1",
"description" : "description #1"
},
{
"name" : "item4",
"description" : "description #4"
}
]
}, {
"_id" : "cat_B",
"items" : [
{
"name" : "item1",
"description" : "description #1"
},
{
"name" : "item2",
"description" : "description #2"
},
{
"name" : "item3",
"description" : "description #3"
},
{
"name" : "item4",
"description" : "description #4"
},
{
"name" : "item5",
"description" : "description #5"
}
]
}
]
Is that possible in pure mongo query?
With the aggregation framework you're going to need a mechanism to split the categories
string into a distinct set but such an operator does not exist yet; the closest you will get is the substr
operator which would require knowing the index of the index position and specified number of characters for the substring to be extracted, which would be near impossible. Hence the recommendation of storing the categories as an array of distinct category names.
--EDIT--
Should you want to keep the categories
field as is then I would suggest you create an extra field that stores the list of categories, then you can run the aggregation pipeline on that field to get the desired result.
Let's use an example to demonstrate the approaches above:
Changing the schema
a) If using MongoDB v3.0 or below:
var bulk = db.items.initializeOrderedBulkOp(),
counter = 0;
db.items.find({}).forEach(doc) {
var categoriesList = doc.categories.replace(/^\s+|\s+$/g,"").split(/\s*;\s*/);
bulk.find({ "_id": doc._id })
.updateOne({
"$set": { "categoriesList": categoriesList }
});
counter++;
if (counter % 1000 == 0) {
bulk.execute();
bulk = db.items.initializeOrderedBulkOp();
}
}
if (counter % 1000 != 0 ) bulk.execute();
b) If using MongoDB v3.2.X or above:
var cursor = db.items.find({}),
bulkUpdateOps = [];
cursor.forEach(function(doc){
var categoriesList = doc.categories.replace(/^\s+|\s+$/g,"").split(/\s*;\s*/);
bulkUpdateOps.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": { "$set": { "categoriesList": categoriesList } }
}
});
if (bulkUpdateOps.length == 1000) {
db.items.bulkWrite(bulkUpdateOps);
bulkUpdateOps = [];
}
});
if (bulkUpdateOps.length > 0) db.items.bulkWrite(bulkUpdateOps);
Running the aggregation on new schema
db.items.aggregate([
{ "$match": { "categoriesList": { "$in": ['cat_A', 'cat_B'] } } },
{ "$unwind": "$categoriesList" },
{
"$group": {
"_id": "$categoriesList",
"items": { "$push": { "name": "$name", "description": '$description' } }
}
}
])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With