I have a collection like
{
"_id" : ObjectId("5738cb363bb56eb8f76c2ba8"),
"records" : [
{
"Name" : "Joe",
"Salary" : 70000,
"Department" : "IT"
}
]
},
{
"_id" : ObjectId("5738cb363bb56eb8f76c2ba9"),
"records" : [
{
"Name" : "Henry",
"Salary" : 80000,
"Department" : "Sales"
},
{
"Name" : "Jake",
"Salary" : 40000,
"Department" : "Sales"
}
]
},
{
"_id" : ObjectId("5738cb363bb56eb8f76c2baa"),
"records" : [
{
"Name" : "Sam",
"Salary" : 90000,
"Department" : "IT"
},
{
"Name" : "Tom",
"Salary" : 50000,
"Department" : "Sales"
}
]
}
I want to have the results with the highest salary by each department
{"Name": "Sam", "Salary": 90000, "Department": "IT"}
{"Name": "Henry", "Salary": 80000, "Department": "Sales"}
I could get the highest salary. But I could not get the corresponding employee names.
db.HR.aggregate([
{ "$unwind": "$records" },
{ "$group":
{
"_id": "$records.Department",
"max_salary": { "$max": "$records.Salary" }
}
}
])
Could somebody help me?
You need to $sort
your document after $unwind
and use the $first
operator in the $group
stage. You can also use the $last
operator in which case you will need to sort your documents in ascending order
db.HR.aggregate([
{ '$unwind': '$records' },
{ '$sort': { 'records.Salary': -1 } },
{ '$group': {
'_id': '$records.Department',
'Name': { '$first': '$records.Name' } ,
'Salary': { '$first': '$records.Salary' }
}}
])
which produces:
{ "_id" : "Sales", "Name" : "Henry", "Salary" : 80000 }
{ "_id" : "IT", "Name" : "Sam", "Salary" : 90000 }
To return the maximum salary and employees list for each department you need to use the $max
in your group stage to return the maximum "Salary" for each group then use $push
accumulator operator to return a list of "Name" and "Salary" for all employees for each group. From there you need to use the $map
operator in your $project
stage to return a list of names alongside the maximum salary. Of course the $cond
here is used to compare each employee salary to the maximum value. The $setDifference
does his work which is filter out all false
and is fine as long as the data being filtered is "unique". In this case it "should" be fine, but if any two results contained the same "name" then it would skew results by considering the two to be one.
db.HR.aggregate([
{ '$unwind': '$records' },
{ '$group': {
'_id': '$records.Department',
'maxSalary': { '$max': '$records.Salary' },
'persons': {
'$push': {
'Name': '$records.Name',
'Salary': '$records.Salary'
}
}
}},
{ '$project': {
'maxSalary': 1,
'persons': {
'$setDifference': [
{ '$map': {
'input': '$persons',
'as': 'person',
'in': {
'$cond': [
{ '$eq': [ '$$person.Salary', '$maxSalary' ] },
'$$person.Name',
false
]
}
}},
[false]
]
}
}}
])
which yields:
{ "_id" : "Sales", "maxSalary" : 80000, "persons" : [ "Henry" ] }
{ "_id" : "IT", "maxSalary" : 90000, "persons" : [ "Sam" ] }
Its not the most intuitive thing, but instead of $max
you should be using $sort
and $first
:
{ "$unwind": "$records" },
{ "$sort": { "$records.Salary": -1},
{ "$group" :
{
"_id": "$records.Department",
"max_salary": { "$first": "$records.Salary" },
"name": {$first: "$records.Name"}
}
}
Alternatively, I think this is doable using the $$ROOT
operator (fair warning: I've not actually tried this) -
{ "$unwind": "$records" },
{ "$group":
{
"_id": "$records.Department",
"max_salary": { "$max": "$records.Salary" }
"name" : "$$ROOT.records.Name"
}
}
}
Another possible solution:
db.HR.aggregate([
{"$unwind": "$records"},
{"$group":{
"_id": "$records.Department",
"arr": {"$push": {"Name":"$records.Name", "Salary":"$records.Salary"}},
"maxSalary": {"$max":"$records.Salary"}
}},
{"$unwind": "$arr"},
{"$project": {
"_id":1,
"arr":1,
"isMax":{"$eq":["$arr.Salary", "$maxSalary"]}
}},
{"$match":{
"isMax":true
}}
])
This solution takes advantage of the $eq operator to compare two fields in the $project stage.
Test case:
db.HR.insert({"records": [{"Name": "Joe", "Salary": 70000, "Department": "IT"}]})
db.HR.insert({"records": [{"Name": "Henry", "Salary": 80000, "Department": "Sales"}, {"Name": "Jake", "Salary": 40000, "Department": "Sales"}, {"Name": "Santa", "Salary": 90000, "Department": "IT"}]})
db.HR.insert({"records": [{"Name": "Sam", "Salary": 90000, "Department": "IT"}, {"Name": "Tom", "Salary": 50000, "Department": "Sales"}]})
Result:
{ "_id" : "Sales", "arr" : { "Name" : "Henry", "Salary" : 80000 }, "isMax" : true }
{ "_id" : "IT", "arr" : { "Name" : "Santa", "Salary" : 90000 }, "isMax" : true }
{ "_id" : "IT", "arr" : { "Name" : "Sam", "Salary" : 90000 }, "isMax" : true }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With