Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB: Aggregation using $cond with $regex

I am trying to group data in multiple stages.

At the moment my query looks like this:

db.captions.aggregate([
{$project: {
    "videoId": "$videoId",
    "plainText": "$plainText",
    "Group1": {$cond: {if: {$eq: ["plainText", {"$regex": /leave\sa\scomment/i}]}, 
        then: "Yes", else: "No"}}}}
])

I am not sure whether it is actually possible to use the $regex operator within a $cond in the aggregation stage. I would appreciate your help very much!

Thanks in advance

like image 626
NIXes Avatar asked Apr 04 '18 19:04

NIXes


1 Answers

UPDATE: Starting with MongoDB v4.1.11, there finally appears to be a nice solution for your problem which is documented here.


Original answer:

As I wrote in the comments above, $regex does not work inside $cond as of now. There is an open JIRA ticket for that but it's, err, well, open...

In your specific case, I would tend to suggest you solve that topic on the client side unless you're dealing with crazy amounts of input data of which you will always only return small subsets. Judging by your query it would appear like you are always going to retrieve all document just bucketed into two result groups ("Yes" and "No").

If you don't want or cannot solve that topic on the client side, then here is something that uses $facet (MongoDB >= v3.4 required) - it's neither particularly fast nor overly pretty but it might help you to get started.

db.captions.aggregate([{
    $facet: { // create two stages that will be processed using the full input data set from the "captions" collection
        "CallToActionYes": [{ // the first stage will...
            $match: { // only contain documents...
                "plainText": /leave\sa\scomment/i // that are allowed by the $regex filter (which could be extended with multiple $or expressions or changed to $in/$nin which accept regular expressions, too)
            }
        }, {
            $addFields: { // for all matching documents...
                "CallToAction": "Yes" // we create a new field called "CallsToAction" which will be set to "Yes"
            }
        }],
        "CallToActionNo": [{ // similar as above except we're doing the inverse filter using $not
            $match: {
                "plainText": { $not: /leave\sa\scomment/i }
            }
        }, {
            $addFields: {
                "CallToAction": "No" // and, of course, we set the field to "No"
            }
        }]
    }
}, {
    $project: { // we got two arrays of result documents out of the previous stage
        "allDocuments" : { $setUnion: [ "$CallToActionYes", "$CallToActionNo" ] } // so let's merge them into a single one called "allDocuments"
    }
}, {
    $unwind: "$allDocuments" // flatten the "allDocuments" result array
}, {
    $replaceRoot: { // restore the original document structure by moving everything inside "allDocuments" up to the top
        newRoot: "$allDocuments"
    }
}, {
    $project: { // include only the two relevant fields in the output (and the _id)
        "videoId": 1,
        "CallToAction": 1
    }
}])

As always with the aggregation framework, it may help to remove individual stages from the end of the pipeline and run the partial query in order to get an understanding of what each individual stage does.

like image 195
dnickless Avatar answered Oct 23 '22 04:10

dnickless