I have a collection with documents like this:
{ datetime: new Date(), count: 1234 }
I want to get sums of count over 24-hour, 7-day, and 30-day intervals.
The result should be like:
{ "sum": 100, "interval": "day" }
{ "sum": 700, "interval": "week" }
{ "sum": 3000, "interval": "month" }
In more abstract terms, I need to group results by multiple conditions (in this case, multiple time intervals).
The MySQL equivalent would be:
SELECT
  IF(time > CURRENT_TIMESTAMP() - INTERVAL 24 HOUR, 1, 0) last_day,
  IF(time > CURRENT_TIMESTAMP() - INTERVAL 168 HOUR, 1, 0) last_week,
  IF(time > CURRENT_TIMESTAMP() - INTERVAL 720 HOUR, 1, 0) last_month,
  SUM(count) count
FROM table
GROUP BY last_day, last_week, last_month
MongoDB's aggregation framework provides date aggregation operators. For example, the $dayOfYear operator extracts that value from the date for use in grouping:
db.collection.aggregate([
  { "$group": {
    "_id": { "$dayOfYear": "$datetime" },
    "total": { "$sum": "$count" }
  }}
])
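One caveat with $dayOfYear on its own is that the same day number from different years lands in the same group. A minimal sketch of a compound key that also includes $year, assuming the collection and field names from the question (the variable name dailyPipeline is only illustrative):

```javascript
// Group by calendar year AND day of year so that, for example,
// 1 January 2020 and 1 January 2021 do not share a bucket.
const dailyPipeline = [
  { "$group": {
    "_id": {
      "year": { "$year": "$datetime" },
      "dayOfYear": { "$dayOfYear": "$datetime" }
    },
    "total": { "$sum": "$count" }
  }}
];

// In the mongo shell this would be run as:
//   db.collection.aggregate(dailyPipeline)
```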
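Or you can use a date math approach instead. Subtracting the epoch date from a date object yields a number (milliseconds since the epoch), to which ordinary math can be applied. Here the remainder of a division by one day is subtracted, which rounds each value down to the start of its day: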
db.collection.aggregate([
  { "$group": {
    "_id": {
      "$subtract": [
        { "$subtract": [ "$datetime", new Date("1970-01-01") ] },
        { "$mod": [
          { "$subtract": [ "$datetime", new Date("1970-01-01") ] },
          1000 * 60 * 60 * 24
        ]}
      ]
    },
    "total": { "$sum": "$count" }
  }}
])
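The arithmetic in that _id expression can be checked outside the database. The following is a small sketch in plain JavaScript, not part of the original answer; the names DAY_MS and floorToUtcDay are made up for illustration:

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Mirrors the $subtract / $mod expression: drop the remainder of a day
// from the milliseconds-since-epoch value.
function floorToUtcDay(date) {
  const ms = date.valueOf();   // same number as "$datetime" minus the epoch date
  return ms - (ms % DAY_MS);   // start of that day, in UTC
}

const bucket = floorToUtcDay(new Date("2021-12-04T15:30:00Z"));
console.log(new Date(bucket).toISOString()); // 2021-12-04T00:00:00.000Z
```

Because the result is a plain number, every document from the same UTC day ends up with the same grouping key.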
If what you are after is intervals measured back from the current point in time, then what you want is basically the date math approach combined with some conditionals via the $cond operator:
db.collection.aggregate([
  { "$match": {
    "datetime": {
      "$gte": new Date(new Date().valueOf() - ( 1000 * 60 * 60 * 24 * 365 ))
    }
  }},
  { "$group": {
    "_id": null,
    "24hours": {
      "$sum": {
        "$cond": [
          { "$gt": [
            { "$subtract": [ "$datetime", new Date("1970-01-01") ] },
            new Date().valueOf() - ( 1000 * 60 * 60 * 24 )
          ]},
          "$count",
          0
        ]
      }
    },
    "30days": {
      "$sum": {
        "$cond": [
          { "$gt": [
            { "$subtract": [ "$datetime", new Date("1970-01-01") ] },
            new Date().valueOf() - ( 1000 * 60 * 60 * 24 * 30 )
          ]},
          "$count",
          0
        ]
      }
    },
    "OneYear": {
      "$sum": {
        "$cond": [
          { "$gt": [
            { "$subtract": [ "$datetime", new Date("1970-01-01") ] },
            new Date().valueOf() - ( 1000 * 60 * 60 * 24 * 365 )
          ]},
          "$count",
          0
        ]
      }
    }
  }}
])
It's essentially the same approach as the SQL example, where the query conditionally evaluates whether the date value falls within the required range and decides whether or not to add the value to the sum.
The one addition here is the $match stage, which restricts the query to only those items that could possibly fall within the maximum one-year range you are asking for. That makes it a bit better than the presented SQL, in that an index can be used to filter those values out and you don't need to "brute force" through non-matching data in the collection.
It is always a good idea to restrict the input with $match when using an aggregation pipeline.
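Since the three $cond blocks differ only in their window length, the pipeline can also be generated. The sketch below is one possible way to do that, not the answer's own code: buildIntervalSums and DAY_MS are hypothetical names, the windows shown are the ones from the question, and it compares "$datetime" directly against a Date cutoff instead of subtracting the epoch first.

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Build a [$match, $group] pipeline from a map of { outputField: windowInDays }.
// `now` is passed in so that every cutoff is computed from the same instant.
function buildIntervalSums(windows, now = new Date()) {
  const nowMs = now.valueOf();
  const maxDays = Math.max(...Object.values(windows));
  const group = { "_id": null };

  for (const [field, days] of Object.entries(windows)) {
    group[field] = {
      "$sum": {
        "$cond": [
          // Date-to-Date comparison (swapped in for the epoch subtraction).
          { "$gt": [ "$datetime", new Date(nowMs - days * DAY_MS) ] },
          "$count",
          0
        ]
      }
    };
  }

  return [
    // Restrict the input to the widest window so an index can be used.
    { "$match": { "datetime": { "$gte": new Date(nowMs - maxDays * DAY_MS) } } },
    { "$group": group }
  ];
}

const intervalPipeline = buildIntervalSums({ "24hours": 1, "7days": 7, "30days": 30 });
// In the mongo shell: db.collection.aggregate(intervalPipeline)
```

Computing all cutoffs from a single `now` value also avoids the small drift that comes from calling new Date() separately for each condition.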
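There are two different ways to do this. One is to issue a separate count() query for each of the ranges. This is pretty easy, and if the datetime field is indexed, it will be fast. Note that count() returns the number of matching documents, so totalling the count field itself would need a small aggregate() with $match and $sum per range.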
The second way is to combine them all into one query using a method similar to your SQL example. To do this, use the aggregate() method with a pipeline: a $project stage creates the 0 or 1 values for the new "last_day", "last_week", and "last_month" fields, and a $group stage then does the sums.
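The answer describes this second pipeline without showing it. A rough sketch of what it could look like follows, assuming the datetime and count fields from the question; buildFlagPipeline, HOUR_MS, cutoff and flag are hypothetical names introduced here.

```javascript
const HOUR_MS = 1000 * 60 * 60;

// $project computes the 0/1 flags, $group sums count per flag combination,
// mirroring the GROUP BY in the SQL example.
function buildFlagPipeline(now = new Date()) {
  const cutoff = (hours) => new Date(now.valueOf() - hours * HOUR_MS);
  const flag = (hours) => ({
    "$cond": [ { "$gt": [ "$datetime", cutoff(hours) ] }, 1, 0 ]
  });

  return [
    { "$project": {
      "count": 1,
      "last_day": flag(24),
      "last_week": flag(168),
      "last_month": flag(720)
    }},
    { "$group": {
      "_id": {
        "last_day": "$last_day",
        "last_week": "$last_week",
        "last_month": "$last_month"
      },
      "count": { "$sum": "$count" }
    }}
  ];
}

const flagPipeline = buildFlagPipeline();
// In the mongo shell: db.collection.aggregate(flagPipeline)
```

As with the SQL, this returns one result per distinct flag combination, so the groups are disjoint: the weekly total, for instance, is the sum of every group whose last_week flag is 1.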
Starting in MongoDB 5.0, this is a nice use case for the $dateDiff operator in association with a $facet stage:
// { date: ISODate("2021-12-04"), count: 3 } <= today
// { date: ISODate("2021-11-29"), count: 5 } <= last week
// { date: ISODate("2021-11-24"), count: 1 } <= last month
// { date: ISODate("2021-11-12"), count: 12 } <= last month
// { date: ISODate("2021-10-04"), count: 8 } <= too old
db.collection.aggregate([
  { $set: {
    diff: { $dateDiff: { startDate: "$$NOW", endDate: "$date", unit: "day" } }
  }},
  { $facet: {
    lastMonth: [
      { $match: { diff: { $gt: -30 } } },
      { $group: { _id: null, total: { $sum: "$count" } } }
    ],
    lastWeek: [
      { $match: { diff: { $gt: -7 } } },
      { $group: { _id: null, total: { $sum: "$count" } } }
    ],
    lastDay: [
      { $match: { diff: { $gt: -1 } } },
      { $group: { _id: null, total: { $sum: "$count" } } }
    ]
  }},
  { $set: {
    lastMonth: { $first: "$lastMonth.total" },
    lastWeek: { $first: "$lastWeek.total" },
    lastDay: { $first: "$lastDay.total" }
  }}
])
// { lastMonth: 21, lastWeek: 8, lastDay: 3 }
This first computes (with $dateDiff) the number of days of difference between today ("$$NOW") and the document's date. If the date is 3 days ago, diff will be set to -3, the intermediate result being:
{ date: ISODate("2021-12-04"), count: 3, diff: 0 }
{ date: ISODate("2021-11-29"), count: 5, diff: -5 }
{ date: ISODate("2021-11-24"), count: 1, diff: -10 }
{ date: ISODate("2021-11-12"), count: 12, diff: -22 }
{ date: ISODate("2021-10-04"), count: 8, diff: -61 }
It then performs a $facet stage, which allows us to run multiple aggregation pipelines within a single stage on the same set of input documents. Each sub-pipeline has its own field in the output document, where its result is stored as an array of documents. This way, we can create a lastMonth field that contains the sum of counts ($sum: "$count") for documents whose day diff with today is less than 30 days ({ $match: { diff: { $gt: -30 } } }), and do the same for lastWeek and lastDay. The intermediate result is:
{
lastMonth: [{ _id: null, total: 21 }],
lastWeek: [{ _id: null, total: 8 }],
lastDay: [{ _id: null, total: 3 }]
}
Finally, it cleans up the $facet output with a $set stage to get the fields in a nice format:
{ lastMonth: 21, lastWeek: 8, lastDay: 3 }
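To see why the sample documents produce those totals, the same logic can be replayed in plain JavaScript. This is only an approximation for checking the numbers, not how the server evaluates the pipeline: dayDiff imitates $dateDiff with unit "day" in UTC by counting the day boundaries crossed, "now" is pinned to the sample's 2021-12-04, and all function and variable names are made up for the example.

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Number of UTC day boundaries crossed going from start to end
// (negative when end is earlier than start).
function dayDiff(start, end) {
  return Math.floor(end.valueOf() / DAY_MS) - Math.floor(start.valueOf() / DAY_MS);
}

// One filter-and-sum per window, like the three $facet sub-pipelines.
function sumByWindows(docs, now) {
  const sumWhere = (minDiff) =>
    docs
      .filter((doc) => dayDiff(now, doc.date) > minDiff)
      .reduce((total, doc) => total + doc.count, 0);

  return { lastMonth: sumWhere(-30), lastWeek: sumWhere(-7), lastDay: sumWhere(-1) };
}

const sampleDocs = [
  { date: new Date("2021-12-04"), count: 3 },
  { date: new Date("2021-11-29"), count: 5 },
  { date: new Date("2021-11-24"), count: 1 },
  { date: new Date("2021-11-12"), count: 12 },
  { date: new Date("2021-10-04"), count: 8 }
];

const totals = sumByWindows(sampleDocs, new Date("2021-12-04"));
console.log(totals); // { lastMonth: 21, lastWeek: 8, lastDay: 3 }
```

Because the difference is counted in whole calendar days, lastDay here means "dated today" and not "within the last 24 hours".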