Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB Aggregate $project

I store our web server logs in MongoDB and the schema looks similar to as follows:

[
  {  
    "_id" : 12345,
    "url" : "http://www.mydomain.com/xyz/abc.html",
    ....
  },
  ....
]

I am trying to use the $project operator to reshape this schema a little bit before I start passing my collection through an aggregation pipeline. Basically, I need to add a new field called "type" that will later be used to perform group-by. The logic for the new field is pretty simple.

if "url" contains "pattern_A" then set "type" = "sales lead";
else if "url" contains "pattern_B" then set "type" = "existing client";
...

I'm thinking it would have to be something like this:

db.weblog.aggregate(
  { 
    $project : {
      type : { /* how to implement the logic??? */ }
    }
  }
);

I know how to do this using map-reduce (by setting the "keyf" attribute to a custom JS function that implements the above logic) but am now trying to use the new aggregation framework to do this. I tried to implement the logic using the expression operators but so far couldn't get it to work. Any help/suggestion would be greatly appreciated!

like image 897
Edenbauer Avatar asked Nov 04 '12 22:11

Edenbauer


People also ask

What does $project do in MongoDB?

The $project takes a document that can specify the inclusion of fields, the suppression of the _id field, the addition of new fields, and the resetting of the values of existing fields. Alternatively, you may specify the exclusion of fields. Specifies the inclusion of a field.

What is difference between $Group and $project in MongoDB?

$group is used to group input documents by the specified _id expression and for each distinct grouping, outputs a document. $project is used to pass along the documents with the requested fields to the next stage in the pipeline.

Can we use $set in aggregation?

You can include one or more $set stages in an aggregation operation. To add field or fields to embedded documents (including documents in arrays) use the dot notation.

What does aggregate do in MongoDB?

What is Aggregation in MongoDB? Aggregation is a way of processing a large number of documents in a collection by means of passing them through different stages. The stages make up what is known as a pipeline. The stages in a pipeline can filter, sort, group, reshape and modify documents that pass through the pipeline.


1 Answers

I am sharing my "solution" in case others encounter the same needs like mine.

After researching for a couple of weeks, as @asya-kamsky suggested in one of his comments, I've decided to add a computed field to my original MongoDB schema. It's not ideal because whenever the logic for the computed field changes I would have to do bulk updates to update all documents in my collection but it was either that or rewrite my code to use MapReduce. I chose the former for now. In looking at MongoDB Jira board, it would appear that many people have asked for more diverse operators to be added for the $project operator and I certainly hope that the MongoDB dev team gets around to adding them sooner than later

Operator for splitting string based on a separator.

New projection operator $elemMatch

Allow $slice operator in $project

add a $inOrder operator to $project

like image 154
Edenbauer Avatar answered Sep 22 '22 02:09

Edenbauer