
MongoDB: aggregation with redact with $in fails

I've got the following two documents in a MongoDB collection named cars:

{vendor: "BMW",
model: "325i",
class: "coupe",
equipment: [
    {name: "airbag",
     cost: 120
    },
    {name: "led",
     cost: 170
    },
    {name: "abs",
     cost: 150
    }
 ]
}

{vendor: "Mercedes",
model: "C 250",
class: "coupe",
equipment: [
    {name: "airbag",
     cost: 180
    },
    {name: "esp",
     cost: 170
    },
    {name: "led",
     cost: 120
    }
 ]
}

I'm playing around with the new aggregation feature $redact, introduced in MongoDB 2.6. I want to perform a query to get a list of cars with filtered equipment parts. In words: give me all cars of class coupe, with only the information about LED and AIRBAG equipment.

I've tried this aggregation:

db.cars.aggregate(
    [
        {$match: {"class": "coupe"}},
        {$redact:
        {
            $cond:
            {
                if: { $in : ["$name", ["airbag", "led"]]},
                then: "$$DESCEND",
                else: "$$PRUNE"
            }
        }
        }
    ]
)

But that leads to the following error:

assert: command failed: { "errmsg" : "exception: invalid operator '$in'", "code" : 15999, "ok" : 0 } : aggregate failed

What am I doing wrong? Is there another way to achieve the goal?

I would expect to get back this result from Mongo:

{vendor: "BMW",
model: "325i",
class: "coupe",
equipment: [
    {name: "airbag",
     cost: 120
    },
    {name: "led",
     cost: 170
    }
 ]
}

{vendor: "Mercedes",
model: "C 250",
class: "coupe",
equipment: [
    {name: "airbag",
     cost: 180
    },
    {name: "led",
     cost: 120
    }
 ]
}

Any suggestions on how to achieve this?

Cheers, Ralf

Ralf asked Jul 01 '14 08:07



1 Answer

The $redact pipeline stage is not really the one you want for this case. It recursively "descends" through the document structure and evaluates the condition at each "level" to decide what action to take there.

In your case, there is no "name" field at the top level of the document to satisfy the condition, so the whole document is "pruned".
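The pruning described above can be sketched in plain JavaScript. This is only a simplified model of how $redact walks a document, not MongoDB's implementation: the redact and isDoc helpers are hypothetical, and Array.prototype.includes stands in for the membership test the question was trying to express.

```javascript
// Simplified, hypothetical model of $redact (not MongoDB's implementation):
// `decide` is called once per document level and returns an action.
const isDoc = (v) => v !== null && typeof v === "object" && !Array.isArray(v);

function redact(doc, decide) {
  const action = decide(doc);
  if (action === "PRUNE") return undefined; // drop this level and everything below it
  if (action === "KEEP") return doc;        // keep this level without further checks
  // DESCEND: keep the fields at this level, re-check each embedded document
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      out[key] = value
        .map((v) => (isDoc(v) ? redact(v, decide) : v))
        .filter((v) => v !== undefined);
    } else if (isDoc(value)) {
      const kept = redact(value, decide);
      if (kept !== undefined) out[key] = kept;
    } else {
      out[key] = value;
    }
  }
  return out;
}

// The condition from the question: keep a level only if its "name" is wanted.
const wanted = ["airbag", "led"];
const original = (level) => (wanted.includes(level.name) ? "DESCEND" : "PRUNE");

const bmw = {
  vendor: "BMW", model: "325i", class: "coupe",
  equipment: [
    { name: "airbag", cost: 120 },
    { name: "led", cost: 170 },
    { name: "abs", cost: 150 },
  ],
};

// The top level has no "name", so the very first check prunes the whole car.
const result = redact(bmw, original);
console.log(result); // undefined
```

The condition is never evaluated against the equipment entries at all, because the walk stops at the first level that fails it.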

What you are after is filtering array content. Where you do not want to use $unwind and $match for that, you can use the new operators such as $map:

db.cars.aggregate([
    { "$match": { "class": "coupe" }},
    { "$project": {
        "equipment": {
            "$setDifference": [
                { "$map": {
                    "input": "$equipment",
                    "as": "el",
                    "in": {
                        "$cond": [
                             { "$or": [
                                { "$eq": [ "$$el.name", "airbag" ] },
                                { "$eq": [ "$$el.name", "led" ] }
                             ]},
                             { "$cond": [ 1, "$$el", 0 ] },
                             false
                        ]
                    }
                }},
                [false]
            ]
        }
    }}
])

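The $map operator works through the array and evaluates a logical condition against each element, in this case within $cond. In this version $in is only a query operator, not an aggregation expression (hence the "invalid operator" error), so the equivalent test is written with $or, which is a logical operator in the aggregation framework.

As any element that does not meet the condition comes back as false, what you need to do now is "remove" all of the false values. $setDifference does this by returning only the elements of the mapped array that do not appear in [false]. Be aware that, as a set operator, it also collapses duplicate elements and does not guarantee the order of the result.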
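The two steps can be followed in plain JavaScript. This is a rough analogue for illustration only, run against a copy of the BMW equipment array; the variable names are made up for the example, and Array.prototype.filter merely stands in for the $setDifference step.

```javascript
// Plain-JavaScript analogue of the $map / $cond / $setDifference steps
// (illustration only; the real work happens inside the MongoDB pipeline).
const equipment = [
  { name: "airbag", cost: 120 },
  { name: "led", cost: 170 },
  { name: "abs", cost: 150 },
];

// $map + $cond: each element becomes either itself or false.
const mapped = equipment.map((el) =>
  el.name === "airbag" || el.name === "led" ? el : false
);

// $setDifference against [false]: drop every false placeholder.
const filtered = mapped.filter((el) => el !== false);

console.log(filtered.map((el) => el.name)); // [ 'airbag', 'led' ]
```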

The result has the filtered equipment you want (add vendor, model and class to the $project if you also need those fields):

{
    "_id" : ObjectId("53b2725120edfc7d0df2f0b1"),
    "equipment" : [
            {
                    "name" : "airbag",
                    "cost" : 120
            },
            {
                    "name" : "led",
                    "cost" : 170
            }
    ]
}
{
    "_id" : ObjectId("53b2725120edfc7d0df2f0b2"),
    "equipment" : [
            {
                    "name" : "airbag",
                    "cost" : 180
            },
            {
                    "name" : "led",
                    "cost" : 120
            }
    ]
}
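On newer servers the same filtering can be written more directly. The following is a sketch that assumes MongoDB 3.4 or later, where $filter (added in 3.2) and $in as an aggregation expression (added in 3.4) are both available; it will not work on 2.6. The pipeline variable is only a name chosen for the example.

```javascript
// Sketch for newer servers only: assumes MongoDB 3.4+
// ($filter exists from 3.2, $in as an aggregation expression from 3.4).
const pipeline = [
  { "$match": { "class": "coupe" } },
  { "$project": {
      "vendor": 1,
      "model": 1,
      "class": 1,
      "equipment": {
        "$filter": {
          "input": "$equipment",
          "as": "el",
          // keep only the elements whose name is in the wanted list
          "cond": { "$in": ["$$el.name", ["airbag", "led"]] }
        }
      }
  }}
];

// In the mongo shell this would be run as: db.cars.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Unlike the $setDifference approach, $filter keeps the array order and any duplicate elements.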

If you were really intent on using $redact then there is always this contrived example:

db.cars.aggregate([
    { "$match": { "class": "coupe" }},
    { "$redact": {
        "$cond": {
            "if": {
                "$or": [
                    { "$eq": [
                       { "$ifNull": ["$name", "airbag"] },
                       "airbag"
                    ]},
                    { "$eq": [
                       { "$ifNull": ["$name", "led"] },
                       "led"
                    ]}
                ]
            },
            "then": "$$DESCEND",
            "else": "$$PRUNE"
        }
    }}
])

This makes sure that where the "name" field does not exist at a level being descended, $ifNull "artificially" supplies a value that matches, so that level does not get "pruned". Where the field does exist, its actual value is used, and if that does not match the condition the level is pruned.
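The effect of the $ifNull condition at each level can be checked in plain JavaScript. This is an analogue for illustration only: keepLevel is a hypothetical helper, and the ?? operator stands in for $ifNull, substituting the default when "name" is missing or null.

```javascript
// Plain-JavaScript analogue of the $ifNull condition (illustration only).
// `??` stands in for $ifNull: it supplies the default when "name" is missing or null.
const keepLevel = (level) =>
  (level.name ?? "airbag") === "airbag" || (level.name ?? "led") === "led";

const car = { vendor: "BMW", model: "325i", class: "coupe" }; // top level: no "name"

console.log(keepLevel(car));                        // true  -> $$DESCEND
console.log(keepLevel({ name: "led", cost: 170 })); // true  -> $$DESCEND
console.log(keepLevel({ name: "abs", cost: 150 })); // false -> $$PRUNE
```

Note the side effect: any other embedded document without a "name" field passes the test as well, which is part of why this approach is contrived.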

Neil Lunn answered Sep 28 '22 03:09