
$unwind an object in aggregation framework

In the MongoDB aggregation framework, I was hoping to use the $unwind operator on an object (i.e. a JSON collection). It doesn't look like this is possible. Is there a workaround? Are there plans to implement this?

For example, take the article collection from the aggregation documentation. Suppose there is an additional field "ratings" that is a map from user -> rating. Could you calculate the average rating for each user?

Other than this, I'm quite pleased with the aggregation framework.

Update: here's a simplified version of my JSON collection per request. I'm storing genomic data. I can't really make genotypes an array, because the most common lookup is to get the genotype for a random person.

    variants: [
        {
            name: 'variant1',
            genotypes: {
                person1: 2,
                person2: 5,
                person3: 7,
            }
        },
        {
            name: 'variant2',
            genotypes: {
                person1: 3,
                person2: 3,
                person3: 2,
            }
        }
    ]
Brett Thomas asked Jun 25 '12 12:06



2 Answers

It is not possible to do the type of computation you are describing with the aggregation framework - and it's not because there is no $unwind method for non-arrays. Even if the person:value objects were documents in an array, $unwind would not help.

The "group by" functionality (whether in MongoDB or in any relational database) operates on the value of a field or column. We group by the value of one field and sum/average/etc. based on the value of another field.

A simple example is a variant of what you suggest: a ratings field added to the example article collection, not as a map from user to rating but as an array like this:

    { title : "title of article", ...
      ratings: [
          { voter: "user1", score: 5 },
          { voter: "user2", score: 8 },
          { voter: "user3", score: 7 }
      ]
    }

Now you can aggregate this with:

    [
      {$unwind: "$ratings"},
      {$group : {_id : "$ratings.voter", averageScore: {$avg:"$ratings.score"} } }
    ]
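
To make the two stages concrete, here is a minimal plain-JavaScript sketch that mimics them in memory. It is not how MongoDB implements them; the sample documents and the variable names (articles, unwound, grouped) are made up for illustration.

```javascript
// Hypothetical sample data in the array-of-subdocuments shape shown above.
const articles = [
  { title: "a", ratings: [{ voter: "user1", score: 5 }, { voter: "user2", score: 8 }] },
  { title: "b", ratings: [{ voter: "user1", score: 7 }] },
];

// Mimics {$unwind: "$ratings"}: one output document per array element,
// with the ratings field replaced by that single element.
const unwound = articles.flatMap((doc) =>
  doc.ratings.map((rating) => ({ ...doc, ratings: rating }))
);

// Mimics the $group stage: accumulate sum and count per voter, then average.
const totals = new Map();
for (const doc of unwound) {
  const t = totals.get(doc.ratings.voter) || { sum: 0, count: 0 };
  t.sum += doc.ratings.score;
  t.count += 1;
  totals.set(doc.ratings.voter, t);
}
const grouped = [...totals].map(([voter, t]) => ({
  _id: voter,
  averageScore: t.sum / t.count,
}));

console.log(grouped);
```

The point is that the grouping key (voter) is a field value in every unwound document, which is exactly what the map-shaped data lacks.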

But this example structured as you describe it would look like this:

    { title : "title of article", ...
      ratings: {
          user1: 5,
          user2: 8,
          user3: 7
      }
    }

or even this:

    { title : "title of article", ...
      ratings: [
          { user1: 5 },
          { user2: 8 },
          { user3: 7 }
      ]
    }

Even if you could $unwind this, there is nothing to aggregate on here. Unless you know the complete list of all possible keys (users) you cannot do much with this. [*]

An analogous relational DB schema to what you have would be:

    CREATE TABLE T (
        user1 integer,
        user2 integer,
        user3 integer
        ...
    );

That's not what would be done; instead we would do this:

    CREATE TABLE T (
        username varchar(32),
        score integer
    );

and now we aggregate using SQL:

select username, avg(score) from T group by username;

There is an enhancement request for MongoDB that may allow you to do this in the aggregation framework in the future: the ability to project values to keys and vice versa. Meanwhile, there is always map/reduce.
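
As a rough idea of what the map/reduce route could look like, here is a sketch of map, reduce and finalize functions for the map-shaped ratings field. All function names and the sample data are hypothetical, and runInMemory is only a stand-in so the sketch runs outside MongoDB; it is not MongoDB's API.

```javascript
// Map: emit one (user, {sum, count}) pair per key of the ratings object.
// `emit` is passed in explicitly so this runs outside MongoDB; inside
// mapReduce, emit is a global and the current document is `this`.
function mapRatings(doc, emit) {
  for (const [user, score] of Object.entries(doc.ratings || {})) {
    emit(user, { sum: score, count: 1 });
  }
}

// Reduce: combine partial results. The output has the same shape as the
// emitted values, so it is safe if reduce is applied more than once.
function reduceRatings(user, values) {
  return values.reduce(
    (acc, v) => ({ sum: acc.sum + v.sum, count: acc.count + v.count }),
    { sum: 0, count: 0 }
  );
}

// Finalize: turn the accumulated sum and count into an average.
function finalizeRatings(user, value) {
  return value.sum / value.count;
}

// Hypothetical in-memory runner, only to show how the pieces fit together.
function runInMemory(docs) {
  const groups = new Map();
  for (const doc of docs) {
    mapRatings(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = finalizeRatings(key, reduceRatings(key, values));
  }
  return result;
}

// Made-up documents in the map shape from the question.
const sampleArticles = [
  { title: "a", ratings: { user1: 5, user2: 8, user3: 7 } },
  { title: "b", ratings: { user1: 3, user3: 9 } },
];
const averages = runInMemory(sampleArticles);
console.log(averages);
```

In the shell, the equivalent call would be along the lines of db.articles.mapReduce(map, reduce, { out: { inline: 1 }, finalize: finalize }), with the map function rewritten to read from `this` and call the global emit.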

[*] There is a complicated way to do this if you know all unique keys (you can find all unique keys with a method similar to this). But if you know all the keys, you may as well run a sequence of queries of the form db.articles.find({"ratings.user1":{$exists:true}},{_id:0,"ratings.user1":1}), one for each userX. Each query returns all of that user's ratings, which you can sum and average simply enough, rather than building the very complex projection the aggregation framework would require.

Asya Kamsky answered Oct 16 '22 17:10


Since MongoDB 3.4.4, you can transform an object into an array using $objectToArray.

See: https://docs.mongodb.com/manual/reference/operator/aggregation/objectToArray/
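
As a sketch of how this could answer the original question, the pipeline below first converts the ratings map into an array of { k, v } pairs (the shape $objectToArray produces), then unwinds and groups as in the first answer. It assumes the collection is called articles and the field ratings, as in the question's example; the averageScore name is borrowed from the first answer.

```javascript
// Pipeline sketch; needs MongoDB 3.4.4 or later for $objectToArray.
const pipeline = [
  // { user1: 5, user2: 8 } becomes [ { k: "user1", v: 5 }, { k: "user2", v: 8 } ]
  { $project: { ratings: { $objectToArray: "$ratings" } } },
  // One output document per { k, v } pair.
  { $unwind: "$ratings" },
  // Group by the former key and average the former value.
  { $group: { _id: "$ratings.k", averageScore: { $avg: "$ratings.v" } } },
];

// In the shell this would be run against a live server as:
//   db.articles.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

The same three stages should apply to the genotypes field from the question by swapping the field name, though that has not been tested here.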

Adrian answered Oct 16 '22 17:10