
MongoDB aggregate by date with empty day bins

I am trying to do a per-day aggregation in MongoDB. I already have an aggregation that successfully groups the data by day. However, I want days with no data to show up in the result as empty bins instead of being skipped.

Below is what I have so far. I have not been able to find anything in the MongoDB documentation or otherwise that suggests how to do aggregations and produce empty bins:

app.models.profile_view.aggregate(
    { $match: { user: req.user._id , 'viewing._type': 'user' } },
    { $project: {
        day: { $dayOfMonth: '$start' },
        month: { $month: '$start' },
        year: { $year: '$start' },
        duration: '$duration'
    } },
    { $group: {
        _id: { day:'$day', month:'$month', year:'$year' },
        count: { $sum: 1 },
        avg_duration: { $avg: '$duration' }
    } },
    { $project: { _id: 0, date: '$_id', count: 1, avg_duration: 1 }}
).exec().then(function(time_series) {
    console.log(time_series)
    return res.send(200, [{ key: 'user', values: time_series }])
}, function(err) {
    console.log(err.stack)
    return res.send(500, { error: err, code: 500, message: 'Failed to retrieve profile view data' })
})
user2205763 asked Aug 01 '14 11:08


2 Answers

I don't think you will be able to solve this problem using aggregation. When you use $group, MongoDB can only group the documents you give it. In this case, how would it know which date values are missing, or even what the range of acceptable dates is?

I think your best option would be to add the missing date values to the result of your aggregation.
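That suggestion can be sketched in application code. This is a minimal sketch, not a definitive implementation: `fillEmptyDays` is a hypothetical helper name, the input is assumed to have the shape produced by the question's pipeline (`{ date: { day, month, year }, count, avg_duration }`), and the empty-bin shape (`count: 0`, `avg_duration: null`) as well as the start and end of the range are assumptions the caller has to choose.

```javascript
// Hypothetical helper: pads an aggregation result with empty day bins.
// `series` items are assumed to look like
// { date: { day, month, year }, count, avg_duration }.
function fillEmptyDays(series, start, end) {
  // Index the existing bins by a "year-month-day" key.
  const byKey = new Map();
  for (const bin of series) {
    const { year, month, day } = bin.date;
    byKey.set(`${year}-${month}-${day}`, bin);
  }

  const filled = [];
  // Walk the range one UTC day at a time; $dayOfMonth, $month and $year
  // use UTC unless a timezone is given.
  const cursor = new Date(Date.UTC(start.getUTCFullYear(), start.getUTCMonth(), start.getUTCDate()));
  const last = Date.UTC(end.getUTCFullYear(), end.getUTCMonth(), end.getUTCDate());
  while (cursor.getTime() <= last) {
    const date = {
      day: cursor.getUTCDate(),
      month: cursor.getUTCMonth() + 1, // $month is 1-based, getUTCMonth() is 0-based
      year: cursor.getUTCFullYear(),
    };
    const key = `${date.year}-${date.month}-${date.day}`;
    // Reuse the real bin if there is one, otherwise emit an assumed empty bin.
    filled.push(byKey.get(key) || { date, count: 0, avg_duration: null });
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return filled;
}

// Made-up sample data for illustration only.
const sample = [
  { date: { day: 3, month: 12, year: 2021 }, count: 1, avg_duration: 40 },
  { date: { day: 5, month: 12, year: 2021 }, count: 2, avg_duration: 25 },
];
const padded = fillEmptyDays(
  sample,
  new Date("2021-12-03T00:00:00Z"),
  new Date("2021-12-06T00:00:00Z")
);
console.log(padded.map((bin) => bin.count)); // [ 1, 0, 2, 0 ]
```

In the question's code, such a helper could be called on `time_series` inside the `.then(...)` callback before sending the response.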

bkan answered Oct 17 '22 15:10


Starting in MongoDB 5.1, this is a good use case for the new $densify aggregation stage:

// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-03") }
// { date: ISODate("2021-12-07") }
db.collection.aggregate([

  { $group: {
    _id: { $dateTrunc: { date: "$date", unit: "day" } },
    total: { $count: {} }
  }},
  // { _id: ISODate("2021-12-03"), total: 1 }
  // { _id: ISODate("2021-12-05"), total: 2 }
  // { _id: ISODate("2021-12-07"), total: 1 }

  { $densify: { field: "_id", range: { step: 1, unit: "day", bounds: "full" } } },
  // { _id: ISODate("2021-12-03"), total: 1 }
  // { _id: ISODate("2021-12-04") }
  // { _id: ISODate("2021-12-05"), total: 2 }
  // { _id: ISODate("2021-12-06") }
  // { _id: ISODate("2021-12-07"), total: 1 }

  { $project: {
    day: "$_id",
    _id: 0,
    total: { $cond: [ { $not: ["$total"] }, 0, "$total" ] }
  }}
])
// { day: ISODate("2021-12-03"), total: 1 }
// { day: ISODate("2021-12-04"), total: 0 }
// { day: ISODate("2021-12-05"), total: 2 }
// { day: ISODate("2021-12-06"), total: 0 }
// { day: ISODate("2021-12-07"), total: 1 }

This:

  • $groups documents by day and counts them with $count
    • $dateTrunc truncates each date to the beginning of its day (the truncation unit).
  • densifies the grouped documents ($densify) by creating a new document for each value of a field (in our case field: "_id") that is missing from the sequence:
    • the step for our densification is 1 day: range: { step: 1, unit: "day", bounds: "full" }
    • bounds: "full" fills the gaps between the earliest and the latest _id in the grouped result
  • finally transforms ($project) fields:
    • renames _id to day
    • sets total to 0 for the documents added by the densify stage ({ total: { $cond: [ { $not: ["$total"] }, 0, "$total" ] } })
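The same approach could be adapted to the collection from the question. The following is a sketch under assumptions, not a tested drop-in: `buildDailyViewsPipeline` is a hypothetical name, the field names (`user`, `viewing._type`, `start`, `duration`) are taken from the question's pipeline, `$ifNull` is swapped in for the `$cond`/`$not` pair, and the `$sort` stage is added only to make the output order explicit.

```javascript
// Hypothetical builder: returns the aggregation pipeline as a plain array.
// Field names come from the question; the MongoDB server is assumed to be 5.1+.
function buildDailyViewsPipeline(userId) {
  return [
    { $match: { user: userId, "viewing._type": "user" } },
    // One document per day that has data.
    { $group: {
        _id: { $dateTrunc: { date: "$start", unit: "day" } },
        count: { $sum: 1 },
        avg_duration: { $avg: "$duration" },
    } },
    // Add one document for each missing day between the first and last day.
    { $densify: { field: "_id", range: { step: 1, unit: "day", bounds: "full" } } },
    // Make the chronological order explicit.
    { $sort: { _id: 1 } },
    // Give the added documents default values.
    { $project: {
        _id: 0,
        date: "$_id",
        count: { $ifNull: ["$count", 0] },
        avg_duration: { $ifNull: ["$avg_duration", null] },
    } },
  ];
}

const pipeline = buildDailyViewsPipeline("someUserId"); // placeholder id
console.log(pipeline.map((stage) => Object.keys(stage)[0]));
```

With Mongoose, the resulting array could be passed as `app.models.profile_view.aggregate(pipeline)`. Note that bounds: "full" only fills gaps between the earliest and latest day that actually have data; to cover a fixed reporting window, `bounds` also accepts an explicit pair of dates.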
Xavier Guihot answered Oct 17 '22 15:10