Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDb aggregation Group by Date

I'm trying to group by timestamp for the collection named "foo" { _id, TimeStamp }

db.foos.aggregate(
[
   {$group : { _id : new Date (Date.UTC({ $year : '$TimeStamp' },{ $month : '$TimeStamp' },{$dayOfMonth : '$TimeStamp'}))       }}
])

Expecting many dates but the result is just one date. The data i'm using is correct (has many foo and different dates except 1970). There's some problem in the date parsing but i can not solve yet.

{
    "result" : [ 
        {
            "_id" : ISODate("1970-01-01T00:00:00.000Z")
        }
    ],
    "ok" : 1
}

Tried this One:

db.foos.aggregate(
[
   {$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 }       }},
   {$project : { parsedDate : new Date('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])

Result :

uncaught exception: aggregate failed: {
    "errmsg" : "exception: disallowed field type Date in object expression (at 'parsedDate')",
    "code" : 15992,
    "ok" : 0
}

And that one:

db.foos.aggregate(
[
   {$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 }       }},
   {$project : { parsedDate : Date.UTC('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])

Can not see dates in the result

{
    "result" : [ 
        {
            "count" : 412
        }, 
        {
            "count" : 1702
        }, 
        {
            "count" : 422
        }
    ],
    "ok" : 1
}
like image 455
jack-london Avatar asked Nov 13 '13 09:11

jack-london


2 Answers

db.foos.aggregate(
    [   
        {   $project : { day : {$substr: ["$TimeStamp", 0, 10] }}},        
        {   $group   : { _id : "$day",  number : { $sum : 1 }}},
        {   $sort    : { _id : 1 }}        
    ]
)

Group by date can be done in two steps in the aggregation framework, an additional third step is needed for sorting the result, if sorting is desired:

  1. $project in combination with $substr takes the first 10 characters (YYYY:MM:DD) of the ISODate object from each document (the result is a collection of documents with the fields "_id" and "day");
  2. $group groups by day, adding (summing) the number 1 for each matching document;
  3. $sort ascending by "_id", which is the day from the previous aggregation step - this is optional if sorted result is desired.

This solution can not take advantage of indexes like db.twitter.ensureIndex( { TimeStamp: 1 } ), because it transforms the ISODate object to a string object on the fly. For large collections (millions of documents) this could be a performance bottleneck and more sophisticated approaches should be used.

like image 191
petra Avatar answered Oct 26 '22 19:10

petra


It depends on whether you want to have the date as ISODate type in the final output. If so, then you can do one of two things:

  1. Extract $year, $month, $dayOfMonth from your timestamp and then reconstruct a new date out of them (you are already trying to do that, but you're using syntax that doesn't work in aggregation framework).

  2. If the original Timestamp is of type ISODate() then you can do date arithmetic to subtract the hours, minutes, seconds and milliseconds from your timestamp to get a new date that's "rounded" to the day.

There is an example of 2 here.

Here is how you would do 1. I'm making an assumption that all your dates are this year, but you can easily adjust the math to accommodate your oldest date.

project1={$project:{_id:0, 
                   y:{$subtract:[{$year:"$TimeStamp"}, 2013]},
                   d:{$subtract:[{$dayOfYear:"$TimeStamp"},1]}, 
                   TimeStamp:1, 
                   jan1:{$literal:new ISODate("2013-01-01T00:00:00")}
         } };
project2={$project:{tsDate:{$add:[
                       "$jan1",
                       {$multiply:["$y", 365*24*60*60*1000]},
                       {$multiply:["$d", 24*60*60*1000]}
         ] } } };

Sample data:

db.foos.find({},{_id:0,TimeStamp:1})
{ "TimeStamp" : ISODate("2013-11-13T19:15:05.600Z") }
{ "TimeStamp" : ISODate("2014-02-01T10:00:00Z") }

Aggregation result:

> db.foos.aggregate(project1, project2)
{ "tsDate" : ISODate("2013-11-13T00:00:00Z") }
{ "tsDate" : ISODate("2014-02-01T00:00:00Z") }
like image 13
Asya Kamsky Avatar answered Oct 26 '22 19:10

Asya Kamsky