Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB - get documents with max attribute per group in a collection

My data looks like this:

session, age, firstName, lastName
1, 28, John, Doe
1, 21, Donna, Keren
2, 32, Jenna, Haze
2, 52, Tommy, Lee
..
..

I'd like to get all the rows which are the largest (by age) per session. So So for the above input my output would look like:

sessionid, age, firstName, lastName
1, 28, John, Doe
2, 52, Tommy, Lee

because John has the largest age in the session = 1 group and Tommy has the largest age on the session=2 group.

I need to export the result to a file (csv) and it may contain lots of records.

How can I achieve this?

like image 589
orcaman Avatar asked Sep 17 '14 14:09

orcaman


1 Answers

MongoDB aggregation offers the $max operator, but in your case you want the "whole" record as it is. So the appropriate thing to do here is $sort and then use the $first operator within a $group statement:

db.collection.aggregate([
    { "$sort": { "session": 1, "age": -1 } },
    { "$group": {
        "_id": "$session",
        "age": { "$first": "$age" },
        "firstName": { "$first" "$firstName" },
        "lastName": { "$first": "$lastName" }
    }}
])

So the "sorting" gets the order right, and the "grouping" picks the first occurrence within the "grouping" key where those fields exist.

Mostly $first here because the $sort is done in reverse order. You can also use $last when in an ascending order as well.

like image 91
Neil Lunn Avatar answered Oct 08 '22 10:10

Neil Lunn