 

Group result by 15 minutes time interval in MongoDb

I have a "status" collection with this structure:

    {
        _id: ObjectId("545a0b63b03dbcd1238b4567"),
        status: 1004,
        comment: "Rem dolor ipsam placeat omnis non. Aspernatur nobis qui nisi similique.",
        created_at: ISODate("2014-11-05T11:34:59.804Z")
    },
    {
        _id: ObjectId("545a0b66b03dbcd1238b4568"),
        status: 1001,
        comment: "Sint et eos vero ipsa voluptatem harum. Hic unde voluptatibus et blanditiis quod modi.",
        created_at: ISODate("2014-11-05T11:35:02.814Z")
    }
    ....
    ....

I need to get the results from that collection grouped into 15-minute intervals.

Hein Zaw Htet asked Nov 08 '14 06:11



2 Answers

There are a couple of ways to do this.

The first is with Date Aggregation Operators, which let you dissect the "date" values in documents. With "grouping" as the primary intent:

    db.collection.aggregate([
      { "$group": {
        "_id": {
          "year": { "$year": "$created_at" },
          "dayOfYear": { "$dayOfYear": "$created_at" },
          "hour": { "$hour": "$created_at" },
          // minute rounded down to 0, 15, 30 or 45
          "interval": {
            "$subtract": [
              { "$minute": "$created_at" },
              { "$mod": [ { "$minute": "$created_at" }, 15 ] }
            ]
          }
        },
        "count": { "$sum": 1 }
      }}
    ])

The second way uses a little trick: when one date object is subtracted from another (or used in another direct math operation), the result is a numeric value, the number of milliseconds between the two. So subtracting the epoch date from a date gives you its epoch milliseconds representation. Then use date math for the interval:

    db.collection.aggregate([
        { "$group": {
            "_id": {
                "$subtract": [
                    { "$subtract": [ "$created_at", new Date("1970-01-01") ] },
                    { "$mod": [
                        { "$subtract": [ "$created_at", new Date("1970-01-01") ] },
                        1000 * 60 * 15
                    ]}
                ]
            },
            "count": { "$sum": 1 }
        }}
    ])
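The same arithmetic can be checked outside the database. What follows is a minimal sketch in plain JavaScript, not part of the pipeline itself; `bucketStart` is a hypothetical helper name, and the two timestamps are the `created_at` values from the sample documents in the question.

```javascript
// Hypothetical helper: floor a Date to the start of its interval using the
// same "value minus (value mod interval)" arithmetic as the $group key.
const INTERVAL_MS = 1000 * 60 * 15; // 15 minutes in milliseconds

function bucketStart(date, intervalMs = INTERVAL_MS) {
  const ms = date.getTime();     // milliseconds since the epoch
  return ms - (ms % intervalMs); // numeric bucket key
}

const a = bucketStart(new Date("2014-11-05T11:34:59.804Z"));
const b = bucketStart(new Date("2014-11-05T11:35:02.814Z"));

console.log(a === b);                   // true: both documents share a bucket
console.log(new Date(a).toISOString()); // 2014-11-05T11:30:00.000Z
```

Because the epoch starts on an hour boundary in UTC, the 15-minute buckets line up with :00, :15, :30 and :45.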

So it depends on what kind of output format you want for the grouping interval. Both basically represent the same thing and have sufficient data to re-construct as a "date" object in your code.
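As a rough illustration of that reconstruction on the client side, here is a sketch in plain JavaScript. The helper names are hypothetical, and it assumes the date operators were evaluated in UTC (their behavior when no timezone is given).

```javascript
// Hypothetical helpers for turning either kind of group key back into a Date.

// Key shape from the date-operator pipeline: { year, dayOfYear, hour, interval }.
// $dayOfYear is 1-based; Date.UTC rolls a day-of-month larger than 31
// forward into the later months, so it can be passed as the "day" of January.
function dateFromParts({ year, dayOfYear, hour, interval }) {
  return new Date(Date.UTC(year, 0, dayOfYear, hour, interval));
}

// Key shape from the date-math pipeline: epoch milliseconds as a number.
function dateFromMillis(ms) {
  return new Date(ms);
}

const fromParts = dateFromParts({ year: 2014, dayOfYear: 309, hour: 11, interval: 30 });
const fromMillis = dateFromMillis(1415187000000);

console.log(fromParts.toISOString());  // 2014-11-05T11:30:00.000Z
console.log(fromMillis.toISOString()); // 2014-11-05T11:30:00.000Z
```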

You can put anything else you want in the "grouping operator" portion after the grouping _id. I'm just using the basic "count" example in lieu of any real statement from yourself as to what you really want to do.
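For example, a sketch with a few more accumulators next to the count might look like the following. The output field names (`statuses`, `firstSeen`, `lastSeen`) and the trailing `$sort` stage are assumptions for illustration, not something the question asked for.

```javascript
// Sketch only: extra accumulators alongside "count".
// Field names statuses / firstSeen / lastSeen are hypothetical.
const pipeline = [
  { "$group": {
    "_id": {
      "$subtract": [
        { "$subtract": [ "$created_at", new Date("1970-01-01") ] },
        { "$mod": [
          { "$subtract": [ "$created_at", new Date("1970-01-01") ] },
          1000 * 60 * 15
        ]}
      ]
    },
    "count": { "$sum": 1 },
    "statuses": { "$addToSet": "$status" },    // distinct status codes in the bucket
    "firstSeen": { "$min": "$created_at" },    // earliest document in the bucket
    "lastSeen": { "$max": "$created_at" }      // latest document in the bucket
  }},
  { "$sort": { "_id": 1 } }                    // buckets in chronological order
];

// In the mongo shell: db.collection.aggregate(pipeline)
```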


MongoDB 4.x and Upwards

There have been some additions to the Date Aggregation Operators since the original writing, and from MongoDB 4.0 there is actual "real casting of types" as opposed to the basic math tricks done here with BSON Date conversion.

For instance we can use $toLong and $toDate as new helpers here:

    db.collection.aggregate([
      { "$group": {
        "_id": {
          "$toDate": {
            "$subtract": [
              { "$toLong": "$created_at" },
              { "$mod": [ { "$toLong": "$created_at" }, 1000 * 60 * 15 ] }
            ]
          }
        },
        "count": { "$sum": 1 }
      }}
    ])

That's a bit shorter and does not require defining an external BSON Date for the "epoch" value as a constant in defining the pipeline so it's pretty consistent for all language implementations.

Those are just two of the "helper" operators for type conversion, which all tie back to the $convert operator, a "longer" form of the implementation that allows custom handling of null values or errors in conversion.
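A sketch of what the longer form could look like is below. It uses the `input`, `to`, `onError` and `onNull` fields of `$convert`; the choice of `null` as the fallback, and the variable names, are assumptions for illustration, and this has not been run against a server here.

```javascript
const INTERVAL_MS = 1000 * 60 * 15;

// $convert form of { "$toLong": "$created_at" } with explicit fallbacks.
// Assumption: documents whose created_at is missing or cannot be converted
// should fall into a single null group instead of aborting the pipeline.
const createdAtMillis = {
  "$convert": {
    "input": "$created_at",
    "to": "long",
    "onError": null,
    "onNull": null
  }
};

const pipeline = [
  { "$group": {
    "_id": {
      "$toDate": {
        "$subtract": [
          createdAtMillis,
          { "$mod": [ createdAtMillis, INTERVAL_MS ] }
        ]
      }
    },
    "count": { "$sum": 1 }
  }}
];

// In the mongo shell: db.collection.aggregate(pipeline)
```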

It's even possible with such casting to get the Date information from the ObjectId of the primary key, as this would be a reliable source of "creation" date:

    db.collection.aggregate([
      { "$group": {
        "_id": {
          "$toDate": {
            "$subtract": [
              { "$toLong": { "$toDate": "$_id" } },
              { "$mod": [ { "$toLong": { "$toDate": "$_id" } }, 1000 * 60 * 15 ] }
            ]
          }
        },
        "count": { "$sum": 1 }
      }}
    ])

So "casting types" with this sort of conversion can be a pretty powerful tool.

Warning - The internal time value in an ObjectId, which is what the $toDate conversion reads, only has precision to the second. The actual "time" recorded is most probably dependent on the driver in use. Where precision is required, it's still recommended to use a discrete BSON Date field instead of relying on ObjectId values.
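The precision limit is easy to see by decoding an ObjectId by hand. The sketch below is plain JavaScript with a hypothetical helper name; it relies on the first 4 bytes of an ObjectId being the seconds since the Unix epoch, and uses the `_id` of the first sample document from the question.

```javascript
// Hypothetical helper: read the creation time embedded in an ObjectId hex string.
// The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
function objectIdToDate(hex) {
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const embedded = objectIdToDate("545a0b63b03dbcd1238b4567");
console.log(embedded.toISOString()); // 2014-11-05T11:34:59.000Z
```

The stored `created_at` for that document is 2014-11-05T11:34:59.804Z, so the ObjectId carries the same second but not the milliseconds.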

Neil Lunn answered Sep 17 '22 18:09


I like the other answer here, mostly for its use of date math instead of the aggregation date operators, which while helpful can also be a little obscure.

The only thing I want to add here is that you can also return a Date object from the aggregation framework by this approach as opposed to the "numeric" timestamp as the result. It's just a little extra math on the same principles, using $add:

    db.collection.aggregate([
        { "$group": {
            "_id": {
                "$add": [
                    { "$subtract": [
                        { "$subtract": [ "$created_at", new Date(0) ] },
                        { "$mod": [
                            { "$subtract": [ "$created_at", new Date(0) ] },
                            1000 * 60 * 15
                        ]}
                    ] },
                    new Date(0)
                ]
            },
            "count": { "$sum": 1 }
        }}
    ])

The Date(0) constructs in JavaScript here represent the same "epoch" date in a shorter form, as 0 milliseconds from epoch is epoch. But the main point is that when a numeric value is added to a BSON date object, the inverse of the described condition is true and the end result is actually now a Date.

All drivers will return the native Date type to their language by this approach.

Blakes Seven answered Sep 16 '22 18:09