
Group By aggregations on MongoDB

I am using MongoDB 2.6 and store my data in this form:

{ 
    "_id" : "my-sensor-1", 
    "points": [ 
        { "timeStamp" : NumberLong("1453468362174"), "value" : 41 }, 
        { "timeStamp" : NumberLong("1453468483297"), "value" : 66 }, 
        { "timeStamp" : NumberLong("1453468485568"), "value" : 49 },
         ...    
    ] 
}

In order to aggregate the documents I make queries like this:

db.pointsTest.aggregate([
    { $match: { $and: [ {"points.timeStamp" : { $gt : 1453433925163}}, 
                        {"_id":"my-sensor-10"} ] } },
    {"$unwind":"$points"},
    {$group: {_id: "my-sensor-1","average":{$avg : "$points.value"}}}
])

Result:

{ "_id" : "my-sensor-1", "average" : 52 }

I have stored the timestamp as milliseconds so every time I want to aggregate a specific time interval I have to change the bounds of timeStamp value.

How can I run the aggregation over a time period and group the results into fixed intervals (e.g. the average value from now() - 1 day, GROUP BY 1h)?

EDIT

I want to make something like this:

db.pointsTest.aggregate([
    { $match: { $and: [ {"points.timeStamp" : { $gt : 1453433925163, $lt : 1453555555555}},
                        {"_id":"my-sensor-10"} ] } },
    {"$unwind":"$points"},
    {$group: {_id: "my-sensor-1","average":{$avg : "$points.value"}, ???}}
])

and the result would be the average over this time interval, grouped by hour. For example, if I want the average value for every hour of 31 December 2015:

timestamp of interval 31/12/2015 20:00:00, average: xyz

timestamp of interval 31/12/2015 21:00:00, average: xyz

At the moment, to achieve that, I have to split the time interval into 1-hour intervals and make several requests to the database.

For example, to do the same in InfluxDB I run this:

"SELECT MEAN(value) From myMeasures where key='my-sensor-1' and time > now() - 1d GROUP BY time(1h)"

asked Jan 22 '16 13:01 by bill



3 Answers

You need to do some arithmetic in the query itself to group the data by interval.

Combining $subtract and $mod does this: subtracting the remainder of timeStamp divided by the interval length rounds each timestamp down to the start of its interval, and that value becomes the group key.

The query looks like this:

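db.pointsTest.aggregate([{
    $match: {
        $and: [{
            "points.timeStamp": {
                $gt: 1453433925163,
                $lt: 1453555555555
            }
        }, {
            "_id": "my-sensor-1"
        }]
    }
}, {
    $unwind: "$points"
}, {
    // the first $match only selects whole documents, so filter the
    // unwound points against the same bounds
    $match: {
        "points.timeStamp": {
            $gt: 1453433925163,
            $lt: 1453555555555
        }
    }
}, {
    "$group": {
        "_id": {
            // round each timestamp down to the start of its hour
            "$subtract": ["$points.timeStamp", {
                "$mod": ["$points.timeStamp", 1000 * 60 * 60]
            }]
        },
        "average": {
            "$avg": "$points.value"
        }
    }
}])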
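
The bucketing arithmetic may be easier to see outside the pipeline. Below is a minimal plain-JavaScript sketch of the same $subtract / $mod idea, using the three sample timestamps from the question. The names floorToInterval and HOUR_MS are made up for illustration and are not part of MongoDB.

```javascript
// Hypothetical helper mirroring { $subtract: [ts, { $mod: [ts, interval] }] }.
var HOUR_MS = 1000 * 60 * 60;

function floorToInterval(timestampMs, intervalMs) {
    // Dropping the remainder rounds the timestamp down to the start of its bucket.
    return timestampMs - (timestampMs % intervalMs);
}

// Sample timestamps from the question.
var samples = [1453468362174, 1453468483297, 1453468485568];
var buckets = samples.map(function (ts) {
    return floorToInterval(ts, HOUR_MS);
});

// Each bucket key is itself a millisecond timestamp, so it converts back to a date.
var bucketDates = buckets.map(function (b) {
    return new Date(b).toISOString();
});
```

All three samples fall into the same hourly bucket, so a $group keyed this way would average them together. Changing the interval constant (for example to 1000 * 60 for minutes) changes the bucket size without touching the rest of the logic.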

I hope this helps.

answered Nov 25 '22 18:11 by Vishwas


To get results at fixed intervals, you can use JavaScript in the mongo shell to prepare the values for your query.

For the example you gave, where you want to $match values from now() - 1 day, you can compute the bound before running the aggregation:

var now = new Date();
var yesterday = new Date();
// setHours accepts out-of-range values, so subtracting 24 hours rolls the date back one day
yesterday.setHours(now.getHours() - 24);

yesterday.getTime() returns the date in milliseconds, which you can use in the $match stage of your aggregation:
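
If a fixed 24-hour window is what you want, plain millisecond arithmetic is an alternative. setHours works in local wall-clock time, so across a daylight-saving change the result can be 23 or 25 hours back rather than exactly 24. A small sketch follows; millisAgo and DAY_MS are hypothetical names, not shell built-ins.

```javascript
var DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: epoch milliseconds for a point `spanMs` before `now`.
// `now` is optional and defaults to the current time.
function millisAgo(spanMs, now) {
    var reference = now === undefined ? new Date() : now;
    return reference.getTime() - spanMs;
}

// Usable directly as the $gt bound in the $match stage.
var cutoff = millisAgo(DAY_MS);
```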

db.pointsTest.aggregate([
    { $match: { $and: [ {"points.timeStamp" : { $gt : yesterday.getTime()}}, 
                        {"_id":"my-sensor-10"} ] } },
    // ... the $unwind, $project and $group stages go here
])

To group the results into hourly intervals, I'd add a $project stage before the $group that adds a new field holding the hourly interval. points.timeStamp - yesterday.getTime() gives the difference in milliseconds between the start of the window and the entry; convert that to hours and round up to the next integer.

Finally, use that new field from the $project stage as the key in the $group stage.
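
Here is one way the whole pipeline could look, as a sketch rather than a tested solution. It rounds down to a zero-based hour index instead of rounding up, because as far as I know $ceil and $floor only arrived in later MongoDB versions, while $subtract, $mod and $divide are available in 2.6. buildHourlyPipeline, sensorId and sinceMs are illustrative names, and the second $match is my addition so that points outside the window are dropped after $unwind.

```javascript
var HOUR_MS = 1000 * 60 * 60;

// Hypothetical helper that builds the $project + $group pipeline described above.
function buildHourlyPipeline(sensorId, sinceMs) {
    // Milliseconds elapsed between the start of the window and each point.
    var elapsed = { $subtract: ["$points.timeStamp", sinceMs] };
    return [
        // Select the sensor document.
        { $match: { _id: sensorId, "points.timeStamp": { $gt: sinceMs } } },
        { $unwind: "$points" },
        // The first $match only selects whole documents, so filter the points too.
        { $match: { "points.timeStamp": { $gt: sinceMs } } },
        // Whole hours since sinceMs: (elapsed - elapsed mod HOUR_MS) / HOUR_MS.
        { $project: {
            value: "$points.value",
            hour: { $divide: [
                { $subtract: [elapsed, { $mod: [elapsed, HOUR_MS] }] },
                HOUR_MS
            ] }
        } },
        { $group: { _id: "$hour", average: { $avg: "$value" } } },
        { $sort: { _id: 1 } }
    ];
}

var pipeline = buildHourlyPipeline("my-sensor-10", Date.now() - 24 * HOUR_MS);
// In the mongo shell: db.pointsTest.aggregate(pipeline)
```

Each result's _id is the number of whole hours since the start of the window, so hour 0 covers the first hour after sinceMs.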

answered Nov 25 '22 16:11 by saljuama


You can also do this with MongoDB's mapReduce.

Try the following code:

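// Build a query that filters by date and _id.
// Note that the query only selects documents whose points array has at least
// one element newer than the cutoff; it does not filter individual elements,
// so map() repeats the cutoff check for each point.
var yesterday = new Date();
yesterday.setDate(yesterday.getDate()-1);
var query = {"points.timeStamp" : { $gt : yesterday.getTime()}, "_id":"my-sensor-1"};

var map = function(){
    var points = this.points;
    for(var i=0;i<points.length;i++){
        var date = new Date(points[i].timeStamp);
        // skip points older than the cutoff (since is passed in via scope)
        if (date.getTime() <= since) continue;

        // remove minutes, seconds and milliseconds, then emit a partial sum and count
        date.setHours(date.getHours(), 0, 0, 0);
        emit(date, {sum: points[i].value, count: 1});
    }
};

// reduce may be called several times for the same key, so it must return
// the same shape it receives: partial sums and counts, not an average
var reduce = function(key, values){
    var result = {sum: 0, count: 0};
    for(var i = 0; i < values.length; i++) {
        result.sum += values[i].sum;
        result.count += values[i].count;
    }
    return result;
};

// calculate the average once per key, after all reducing is done
var finalize = function(key, reduced){
    return reduced.sum / reduced.count;
};

db.pointsTest.mapReduce(map, reduce, {
    out: {inline: 1},
    query: query,
    scope: {since: yesterday.getTime()},
    finalize: finalize
})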
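
MongoDB may call reduce more than once for the same key, feeding earlier output back in as input, so it is worth checking how a reduce function behaves in that case. The sketch below simulates one such re-reduce locally in plain JavaScript, using the three values from the question. All function names here are illustrative.

```javascript
// Naive reduce: returns an average, which breaks when its output is reduced again.
function reduceAverage(key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) { total += values[i]; }
    return total / values.length;
}

// Re-reduce-safe variant: values are { sum, count } pairs.
function reduceSumCount(key, values) {
    var result = { sum: 0, count: 0 };
    for (var i = 0; i < values.length; i++) {
        result.sum += values[i].sum;
        result.count += values[i].count;
    }
    return result;
}

function finalizeAverage(key, reduced) {
    return reduced.sum / reduced.count;
}

// Values from the question; the true mean is (41 + 66 + 49) / 3 = 52.
var values = [41, 66, 49];

// Simulate reducing the first two values, then re-reducing with the third.
var naive = reduceAverage("k", [reduceAverage("k", [41, 66]), 49]);

var pairs = values.map(function (v) { return { sum: v, count: 1 }; });
var partial = reduceSumCount("k", pairs.slice(0, 2));
var safe = finalizeAverage("k", reduceSumCount("k", [partial, pairs[2]]));
```

Here naive comes out as 51.25 while safe gives 52, matching the average shown in the question. Carrying sums and counts through reduce and dividing only in a finalize step keeps the result independent of how the server batches the values.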

answered Nov 25 '22 16:11 by Volodymyr Synytskyi