
MongoDB aggregate fields without knowing all the fields beforehand

How can I compute the aggregate of the following metrics without knowing all the metrics beforehand? Can I accomplish this using the aggregation framework or MapReduce?

[
  {
   player_id: '123',
   timestamp: <sometime>,
   metrics: {
     points_per_game: 1,
     rebounds_per_game: 2,
     assist_per_game: 3,
   }
  },
  {
    player_id: '123',
    timestamp: <sometime>,
    metrics: {
      points_per_game: 1,
      rebounds_per_game: 2,
    }
  },
  {
    player_id: '345',
    timestamp: <sometime>,
    metrics: {
      points_per_game: 1,
      rebounds_per_game: 2,
      point_in_the_paint_per_game: 2
    }
  }
]

I would like to have the following result:

[
 {
   player_id: '123',
   metrics: {
     points_per_game: 2,
     rebounds_per_game: 4,
     assist_per_game: 3,
   }
 },
 {
   player_id: '345',
   metrics: {
     points_per_game: 1,
     rebounds_per_game: 2,
     point_in_the_paint_per_game: 2
   }
 }
]

I cannot do something like the following, since it would require me to know every metric:

db.stats.aggregate([
   {$group: {
     _id: {player: "$player_id"},
     points_per_game: { $sum: "$metrics.points_per_game"},
     ...
   }}
])
eNddy asked Jun 17 '19 23:06


1 Answer

You can try the aggregation below.

Convert the metrics object into an array of key-value pairs with $objectToArray, then $unwind it and $group by player and key to sum the values for each metric. A final $group converts the pairs back into a single object with named keys per player.

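db.colname.aggregate([
  // Turn metrics into an array of {k, v} pairs
  {"$addFields":{"metrics":{"$objectToArray":"$metrics"}}},
  // One document per metric pair
  {"$unwind":"$metrics"},
  // Sum each metric per player
  {"$group":{
    "_id":{"id":"$player_id","key":"$metrics.k"},
    "count":{"$sum":"$metrics.v"}
  }},
  // Rebuild one metrics object per player from the summed pairs
  {"$group":{
    "_id":"$_id.id",
    "metrics":{"$mergeObjects":{"$arrayToObject":[[["$_id.key","$count"]]]}}
  }}
])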
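
To make the stages above easier to follow, here is a rough plain-JavaScript mirror of the same steps run on an in-memory array. It is only a sketch for illustration, not how MongoDB executes the pipeline: sumMetricsByPlayer is a hypothetical helper name, the sample data is copied from the question with the timestamps left out, and it assumes every metric value is a number.

```javascript
// Plain-JavaScript mirror of the pipeline stages (illustrative only;
// sumMetricsByPlayer is a hypothetical helper, not a MongoDB API).
function sumMetricsByPlayer(docs) {
  // $objectToArray + $unwind: one {id, k, v} row per metric entry.
  const rows = docs.flatMap((doc) =>
    Object.entries(doc.metrics || {}).map(([k, v]) => ({ id: doc.player_id, k, v }))
  );

  // First $group: sum v for each (player, metric key) pair.
  const sums = new Map(); // player id -> Map(metric key -> running total)
  for (const { id, k, v } of rows) {
    if (!sums.has(id)) sums.set(id, new Map());
    const perPlayer = sums.get(id);
    perPlayer.set(k, (perPlayer.get(k) || 0) + v);
  }

  // Second $group ($arrayToObject + $mergeObjects): fold each player's
  // key/total pairs back into a single object.
  return Array.from(sums, ([id, perPlayer]) => ({
    _id: id,
    metrics: Object.fromEntries(perPlayer),
  }));
}

// Sample documents from the question (timestamps omitted).
const docs = [
  { player_id: '123', metrics: { points_per_game: 1, rebounds_per_game: 2, assist_per_game: 3 } },
  { player_id: '123', metrics: { points_per_game: 1, rebounds_per_game: 2 } },
  { player_id: '345', metrics: { points_per_game: 1, rebounds_per_game: 2, point_in_the_paint_per_game: 2 } },
];

console.log(JSON.stringify(sumMetricsByPlayer(docs), null, 2));
```

As in the pipeline, the player ends up under _id rather than player_id, so a rename step would still be needed to match the exact shape asked for in the question.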
s7vr answered Sep 29 '22 08:09