Can the MongoDB aggregation framework $group return an array of values?

How flexible is the aggregate function for output formatting in MongoDB?

Data format:

{
        "_id" : ObjectId("506ddd1900a47d802702a904"),
        "port_name" : "CL1-A",
        "metric" : "772.0",
        "port_number" : "0",
        "datetime" : ISODate("2012-10-03T14:03:00Z"),
        "array_serial" : "12345"
}

Right now I'm using this aggregate function to return an array of DateTime, an array of metrics, and a count:

{ $match : {
    'array_serial' : array,
    'port_name' : { $in : ports },
    'datetime' : { $gte : from, $lte : to }
}},
{ $project : { port_name : 1, metric : 1, datetime : 1 } },
{ $group : {
    _id : "$port_name",
    datetime : { $push : "$datetime" },
    metric : { $push : "$metric" },
    count : { $sum : 1 }
}}

This works well and is very fast, but is there a way to format the output so that each datetime/metric pair is its own array? Like this:

[
    {
      "_id" : "portname",
      "data" : [
                ["2012-10-01T00:00:00.000Z", 1421.01],
                ["2012-10-01T00:01:00.000Z", 1361.01],
                ["2012-10-01T00:02:00.000Z", 1221.01]
               ]
    }
]

This would greatly simplify the front-end as that's the format the chart code expects.

Chris Matta asked Oct 08 '12 17:10


2 Answers

Combining two fields into an array of values with the Aggregation Framework is possible, but it isn't as straightforward as it could be (at least as of MongoDB 2.2.0).

Here is an example:

db.metrics.aggregate(

    // Find matching documents first (can take advantage of index)
    { $match : {
        'array_serial' : array, 
        'port_name' : { $in : ports},
        'datetime' : { $gte : from, $lte : to}
    }},

    // Project desired fields and add a constant index array [0,1], one entry per element of each output pair
    { $project: {
        port_name: 1,
        datetime: 1,
        metric: 1,
        index: { $const:[0,1] }
    }},

    // Split into a document stream: one document per value of index
    { $unwind: '$index' },

    // Re-group data using conditional to create array [$datetime, $metric]
    { $group: {
        _id: { id: '$_id', port_name: '$port_name' },
        data: {
            $push: { $cond:[ {$eq:['$index', 0]}, '$datetime', '$metric'] }
        }
    }},

    // Sort results
    { $sort: { _id:1 } },

    // Final group by port_name with data array and count
    { $group: {
        _id: '$_id.port_name',
        data: { $push: '$data' },
        count: { $sum: 1 }
    }}
)
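
To see what each stage contributes, the same steps can be traced in plain JavaScript without a database. This is only a rough sketch: the two sample documents are made up (the second one in particular is hypothetical), and the names unwound, pairs and byPort are not part of MongoDB. The $sort stage is left out because a JavaScript Map already keeps insertion order.

```javascript
// Hypothetical sample documents shaped like the ones in the question.
const docs = [
  { _id: 1, port_name: "CL1-A", datetime: "2012-10-03T14:03:00Z", metric: "772.0" },
  { _id: 2, port_name: "CL1-A", datetime: "2012-10-03T14:04:00Z", metric: "801.5" }
];

// $project + $unwind: emit one copy of each document per index value (0, then 1).
const unwound = docs.flatMap(d => [0, 1].map(index => ({ ...d, index })));

// First $group: per original document, push datetime for index 0 and metric for index 1.
const pairs = new Map();
for (const d of unwound) {
  if (!pairs.has(d._id)) pairs.set(d._id, { port_name: d.port_name, data: [] });
  pairs.get(d._id).data.push(d.index === 0 ? d.datetime : d.metric);
}

// Final $group: per port_name, push each [datetime, metric] pair and count the pairs.
const byPort = new Map();
for (const { port_name, data } of pairs.values()) {
  if (!byPort.has(port_name)) byPort.set(port_name, { _id: port_name, data: [], count: 0 });
  const group = byPort.get(port_name);
  group.data.push(data);
  group.count += 1;
}

const result = [...byPort.values()];
console.log(JSON.stringify(result, null, 2));
```

Each entry of result has the same shape as the output asked for in the question: one document per port name, with data holding one [datetime, metric] array per source document.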
Stennie answered Oct 04 '22 12:10


MongoDB 2.6 made this a lot easier by introducing $map, which allows a simpler form of array transposition:

db.metrics.aggregate([
    { "$match": {
        "array_serial": array,
        "port_name": { "$in": ports },
        "datetime": { "$gte": from, "$lte": to }
    }},
    { "$group": {
        "_id": "$port_name",
        "data": {
            "$push": {
                "$map": {
                    "input": [0,1],
                    "as": "index",
                    "in": {
                        "$cond": [
                            { "$eq": [ "$$index", 0 ] },
                            "$datetime",
                            "$metric"
                        ]
                    }
                }
            }
        },
        "count": { "$sum": 1 }
    }}
])

Much like the approach with $unwind, you supply a two-element array as the "input" to the $map operation and then replace each of those values with the field value you want via the $cond operation.
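
As a rough analogy only, the per-document work done by "$map" with "$cond" looks like this in plain JavaScript. The sampleDoc object and the toPair helper are made up for illustration and are not part of MongoDB; the document values are taken from the question.

```javascript
// Sample document with the fields used by the pipeline (values from the question).
const sampleDoc = {
  port_name: "CL1-A",
  metric: "772.0",
  datetime: new Date("2012-10-03T14:03:00Z")
};

// Analogue of { "$map": { "input": [0,1], "as": "index", "in": { "$cond": [...] } } }:
// walk the placeholder array and swap each index for the matching field value.
function toPair(doc) {
  return [0, 1].map(index => (index === 0 ? doc.datetime : doc.metric));
}

const pair = toPair(sampleDoc);
console.log(pair);
```

The "$push" accumulator then collects one such pair per document into the "data" array of each group.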

This removes the pipeline juggling that previous releases required to transform the document. What remains is the actual aggregation, which is basically accumulating per "port_name" value, and the transformation to an array is no longer a problem area.

Blakes Seven answered Oct 04 '22 12:10