Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB aggregate selective project

I am running into problems reshaping my document during aggregation for grouping. Basically I want to push entries to fields depending in their type. I have a structure as follows:

_id: P1
   entities: [{type: A, val: X}, {type: B, val: X}, {type: A, val: Y}]
   ...

I want to $unwind and $project these entities so that get a structure like:

_id: P1
   A: [X]
   B: []
_id: P1
   A: [Y]
   B: []
_id: P1
   A: []
   B: [X]

so i can perform a grouping by A or B or both, i.e.

$group: {
   _id: {
       A: $A, 
       B: $B
    }
    count: {$sum : 1}

I thought I could simply do:

$unwind: $entities
$project: {
   id: §id
   A: {"$cond":[{"$eq":["$type","A"]},"$code"]}
   B: {"$cond":[{"$eq":["$type","B"]},"$code"]}
}
$group: {
  _id: "$id"
  A: {$addToSet : "$A"}
}

or failing that something like

$unwind: $entities
$group: {
  _id: "$id"
  A: {"$cond":[{"$eq":["$type","A"]},$push: "$code", null]}
  ...
}

but both versions fail because I cannot do nothing on else and I did not manage to use $push inside a conditional. The closest I got is project depending on the type, but since I could not find a way not to add anything to the field when there was no match, I end up with:

_id: P1
   A: [X,null,Y]
   B: [null,X,null]

which messes up the counting. My second idea was to filter the arrays to remove null. But I did not find a way to remove entities, because again $cond wouldnt let me specify an empty/"do nothing" else case.

I have a feeling it could work with grouping by type and content with matching of the required types, but because I have many types and arbitrary groupings resulting in a grouping tree, this might become very complicated. Ideas or hints to mistakes would be very welcome.

Thank you

EDIT: The solution based on the accepted anwser

I had to slightly adapt it, to filter cases where all content of a type was null, because otherwise it would have been lost during matching and because I want to keep that knowledge. Thanks!

{$project:{
  A: {$cond: [
      {$eq: ["$A", [false]]},
      ["N/A"],
      "$A"
  ]},
  B: {$cond: [
      {$eq: ["$B", [false]]},
      ["N/A"],
      "$B"
  ]},
}},
{ "$unwind": "$A" },
{ "$match": { "A": { "$ne": false } } },
{ "$group": {
    "_id": "$_id",
    "A": { "$push": "$A" },
    "B": { "$first": "$B" }
}},
{ "$unwind": "$B" },
{ "$match": { "B": { "$ne": false } } },
{ "$group": {
    "_id": "$_id",
    "A": { "$first": "$A" },
    "B": { "$push": "$B" }
}}
like image 244
user1437515 Avatar asked Jun 30 '14 18:06

user1437515


People also ask

How do I project specific fields in MongoDB?

You can select a single field in MongoDB using the following syntax: db. yourCollectionName. find({"yourFieldName":yourValue},{"yourSingleFieldName":1,_id:0});

What is Project in MongoDB aggregation?

The $project takes a document that can specify the inclusion of fields, the suppression of the _id field, the addition of new fields, and the resetting of the values of existing fields. Alternatively, you may specify the exclusion of fields.

How can you group by a particular value in MongoDB?

We can group by single as well as multiple field from the collection, we can use $group operator in MongoDB to group fields from the collection and returns the new document as result. We are using $avg, $sum, $max, $min, $push, $last, $first and $addToSet operator with group by in MongoDB.

Is MongoDB good for aggregate?

As with many other database systems, MongoDB allows you to perform a variety of aggregation operations. These allow you to process data records in a variety of ways, such as grouping data, sorting data into a specific order, or restructuring returned documents, as well as filtering data as one might with a query.


1 Answers

You seemed on the right track, there are just different approaches to removing those values of false from the conditional. You cannot have it return nothing, but you cn get rid of the values you do not want.

If you truly want "sets" and you have MongoDB 2.6 or greater available, then you basically filter out the false values using $setDifference:

db.entities.aggregate([
    { "$unwind": "$entities" },
    { "$group": {
        "_id": "$_id",
        "A": { 
            "$addToSet": {
                "$cond": [
                    { "$eq": [ "$entities.type", "A" ] },
                    "$entities.val",
                    false
                ]
            }
        },
        "B": { 
            "$addToSet": {
                "$cond": [
                    { "$eq": [ "$entities.type", "B" ] },
                    "$entities.val",
                    false
                ]
            }
        }
    }},
    { "$project": {
        "A": {
            "$setDifference": [ "$A", [false] ]
        },
        "B": {
            "$setDifference": [ "$B", [false] ]
        }
    }}
])

Or just as one step using the $map operator inside $project:

db.entities.aggregate([
    {"$project": {
        "A": {
             "$setDifference": [
                 {
                     "$map": {
                         "input": "$entities",
                         "as": "el",
                         "in": {
                             "$cond": [
                                 { "$eq": [ "$$el.type", "A" ] },
                                 "$$el.val",
                                 false
                             ]
                         }
                     }
                 },
                 [false]
             ]
         },
        "B": {
             "$setDifference": [
                 {
                     "$map": {
                         "input": "$entities",
                         "as": "el",
                         "in": {
                             "$cond": [
                                 { "$eq": [ "$$el.type", "B" ] },
                                 "$$el.val",
                                 false
                             ]
                         }
                     }
                 },
                 [false]
             ]
         }
    }}
])

Or otherwise stay with general $unwind and $match operators to filter these:

db.entities.aggregate([
    { "$unwind": "$entities" },
    { "$group": {
        "_id": "$_id",
        "A": { 
            "$push": {
                "$cond": [
                    { "$eq": [ "$entities.type", "A" ] },
                    "$entities.val",
                    false
                ]
            }
        },
        "B": { 
            "$push": {
                "$cond": [
                    { "$eq": [ "$entities.type", "B" ] },
                    "$entities.val",
                    false
                ]
            }
        }
    }},
    { "$unwind": "$A" },
    { "$match": { "A": { "$ne": false } } },
    { "$group": {
        "_id": "$_id",
        "A": { "$push": "$A" },
        "B": { "$first": "$B" }
    }},
    { "$unwind": "$B" },
    { "$match": { "B": { "$ne": false } } },
    { "$group": {
        "_id": "$_id",
        "A": { "$first": "$A" },
        "B": { "$push": "$B" }
    }}
])

Using either $push for normal arrays or $addToSet for unique sets.

like image 97
Neil Lunn Avatar answered Sep 27 '22 19:09

Neil Lunn