I have a collection called 'InventoryPerDay' whose documents contain the inventory of each store for each day:
{
    _id: "20131202/store_a",
    _metadata: {
        date: ISODate("2013-12-02T00:00:00Z"),
        store: "store_a"
    },
    inventory: {
        quantity: {
            item_44: 1350,
            item_32: 1,
            item_2: 1,
            item_9: 1
        }
    }
},
{
    _id: "20131201/store_a",
    _metadata: {
        date: ISODate("2013-12-01T00:00:00Z"),
        store: "store_a"
    },
    inventory: {
        quantity: {
            item_44: 1000,
            item_32: 5,
            item_2: 10
        }
    }
}
I need the total quantity of each item in store_a across both days. The keys in the 'quantity' hash are not known in advance: for example, 'item_9' exists for 02/12/2013 but not for 01/12/2013.
How can I sum unknown nested keys across multiple documents using aggregation in MongoDB?
The result for the above example should be:
{
    store: "store_a",
    inventory: {
        quantity: {
            item_44: 2350,
            item_32: 6,
            item_2: 11,
            item_9: 1
        }
    }
}
Unfortunately, what you are seeking to do is not possible with the current capabilities of MongoDB (not with aggregation, at least). If you keep your current schema, you will have to use map-reduce, which will be much slower.
See here for an explanation of why your schema is not optimal.
See here for how you should structure your schema and what your aggregation should look like.
Also, what you are asking would be possible with your current schema if you could project the names of fields to values, so you may want to vote for this ticket so that it gets more attention.
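For reference, the map-reduce route mentioned above could look roughly like the sketch below. It is a minimal illustration, not a tuned implementation: mapFn, reduceFn and runLocally are hypothetical names, and runLocally together with the local emit variable is only a stand-in so the two functions can be tried outside the database (on the server, emit is supplied by MongoDB).

```javascript
// Local stand-in for emit(); on the server MongoDB provides the real one.
let emit = null;

// Map: called once per document, with `this` bound to the document.
// Emits the whole quantity object under the store name.
function mapFn() {
  emit(this._metadata.store, this.inventory.quantity);
}

// Reduce: merges the quantity objects for one store, summing per item.
// It returns the same shape it receives, so it can safely be re-applied.
function reduceFn(key, values) {
  var totals = {};
  values.forEach(function (quantity) {
    Object.keys(quantity).forEach(function (item) {
      totals[item] = (totals[item] || 0) + quantity[item];
    });
  });
  return totals;
}

// Hypothetical local harness: groups emitted values by key, then reduces.
function runLocally(docs) {
  const groups = {};
  emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) { mapFn.call(doc); });
  const out = {};
  Object.keys(groups).forEach(function (key) {
    out[key] = reduceFn(key, groups[key]);
  });
  return out;
}

// The two sample documents from the question (dates omitted for brevity).
const sampleDocs = [
  { _id: "20131202/store_a", _metadata: { store: "store_a" },
    inventory: { quantity: { item_44: 1350, item_32: 1, item_2: 1, item_9: 1 } } },
  { _id: "20131201/store_a", _metadata: { store: "store_a" },
    inventory: { quantity: { item_44: 1000, item_32: 5, item_2: 10 } } }
];

const mapReduceTotals = runLocally(sampleDocs);
console.log(JSON.stringify(mapReduceTotals));
```

In the mongo shell, the same two functions would be passed to db.InventoryPerDay.mapReduce(mapFn, reduceFn, { out: { inline: 1 } }), without the local emit variable or the harness.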
Update
You need to change your schema as follows:
{
    _id: "20131202/store_a",
    _metadata: {
        date: ISODate("2013-12-02T00:00:00Z"),
        store: "store_a"
    },
    inventory: {
        quantities: [
            { k: "item_44", v: 1350 },
            { k: "item_32", v: 1 },
            { k: "item_2", v: 1 },
            { k: "item_9", v: 1 }
        ]
    }
},
.
.
.
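If existing documents need to be migrated to this layout, the conversion itself is mechanical. The following is a minimal sketch in plain JavaScript; toKeyValueSchema is a hypothetical helper name, and how the converted documents are written back to the collection is left open.

```javascript
// Hypothetical helper: turns the original `inventory.quantity` object
// into an `inventory.quantities` array of { k, v } pairs.
function toKeyValueSchema(doc) {
  const quantity = (doc.inventory && doc.inventory.quantity) || {};
  const quantities = Object.keys(quantity).map(function (key) {
    return { k: key, v: quantity[key] };
  });
  return {
    _id: doc._id,
    _metadata: doc._metadata,
    inventory: { quantities: quantities }
  };
}

// One of the sample documents from the question, in the old layout.
const oldDoc = {
  _id: "20131201/store_a",
  _metadata: { date: new Date("2013-12-01T00:00:00Z"), store: "store_a" },
  inventory: { quantity: { item_44: 1000, item_32: 5, item_2: 10 } }
};

const newDoc = toKeyValueSchema(oldDoc);
console.log(JSON.stringify(newDoc.inventory.quantities));
```

The helper copies _id and _metadata unchanged, so the converted document can replace the original one under the same key.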
and the query should look like this:
db.InventoryPerDay.aggregate([
    // Emit one document per element of the quantities array.
    {
        "$unwind": "$inventory.quantities"
    },
    // Sum the values for every (store, item) pair.
    {
        "$group": {
            "_id": { "store": "$_metadata.store", "item": "$inventory.quantities.k" },
            "total": {
                "$sum": "$inventory.quantities.v"
            }
        }
    }
])
This will give you a result like this:
{
    result: [
        {
            "_id": { store: "store_a", item: "item_44" },
            "total": 2350
        },
        {
            "_id": { store: "store_a", item: "item_32" },
            "total": 6
        },
        {
            "_id": { store: "store_a", item: "item_2" },
            "total": 11
        },
        {
            "_id": { store: "store_a", item: "item_9" },
            "total": 1
        }
    ]
}
You can format these results with the $project operator at the end of the aggregation pipeline.
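If the nested shape from the question is needed, one option is to fold the result rows in application code. The following is a minimal sketch in plain JavaScript, assuming the rows look like the result above; foldByStore is a hypothetical helper name.

```javascript
// Hypothetical helper: folds the aggregation rows into one object per
// store, shaped like the desired result in the question.
function foldByStore(rows) {
  const byStore = {};
  rows.forEach(function (row) {
    const store = row._id.store;
    if (!byStore[store]) {
      byStore[store] = { store: store, inventory: { quantity: {} } };
    }
    byStore[store].inventory.quantity[row._id.item] = row.total;
  });
  return Object.keys(byStore).map(function (store) {
    return byStore[store];
  });
}

// The rows from the aggregation result shown above.
const rows = [
  { _id: { store: "store_a", item: "item_44" }, total: 2350 },
  { _id: { store: "store_a", item: "item_32" }, total: 6 },
  { _id: { store: "store_a", item: "item_2" }, total: 11 },
  { _id: { store: "store_a", item: "item_9" }, total: 1 }
];

const folded = foldByStore(rows);
console.log(JSON.stringify(folded, null, 2));
```

Because the rows are grouped by store first, the same helper also works when the aggregation covers more than one store.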
Regarding map-reduce vs. aggregation: map-reduce is much slower than aggregation, mainly because it is executed on a single thread. In most cases it takes orders of magnitude more time to compute a result with map-reduce than with aggregation. You can vote for this ticket so that map-reduce gets implemented for multiple cores.