Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mongo average aggregation query with no group

I am trying to get the average of a whole field using the aggregation framework in Mongo. However i can't seem to find any example that uses it without a group parameter.

I have the following document structure:

 {
      "_id" : ObjectId("5352703b61d2739b2ea44e4d"),
      "Semana" : "2014-02-23 - 2014-03-01",
      "bolsaDeValores" : "7",
      "bvc" : "8",
      "dollar" : "76",
      "ecopetrol" : "51",
      "dollarPrice" : "18"
 }

Basically what i want to do is get the average value of the bvc field, and any other numeric one, for the whole collection in the fastest possible way (without using MapReduce as it is less efficient than the Aggregation Framework).

I have tried to group on a greater than zero basis as well but to no avail:

db.EvaluatedSentiments.aggregate([
    { "$group": { 
        "bvc" : {"$gt:0"}
        }, 
        {
            "bvc" : { "$avg" : "$bvc"}
        }
    }
])

I appreciate any help you could provide.

References: Mongo aggregation manual

like image 729
NicolasZ Avatar asked Apr 27 '14 12:04

NicolasZ


People also ask

How does MongoDB calculate average?

We can manually verify this is correct by calculating the average of the points values by hand: Average of Points: (30 + 30 + 20 + 25 + 25) / 5 = 26.

How do I count aggregates in MongoDB?

MongoDB aggregate $count query It transfers a document to the next stage that contains a count of the number of documents input to the stage. Here, the string is the name of the output field which has the count as its value. And, the string must be a non-empty string, not start with '$' and not contain '.

What is difference between $Group and $project in MongoDB?

$group is used to group input documents by the specified _id expression and for each distinct grouping, outputs a document. $project is used to pass along the documents with the requested fields to the next stage in the pipeline.

Which aggregation method is preferred for use by MongoDB?

The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method for data aggregation in MongoDB. The aggregation pipeline can operate on a sharded collection. The aggregation pipeline can use indexes to improve its performance during some of its stages.


2 Answers

First of all store numerical values as numbers. Afterwards you can use a simple statement to calculate the average:

db.collection.aggregate([{ 
  "$group": {
    "_id": null, 
    "avg_bvc": { "$avg": "$bvc" } 
  } 
}])

You can simply use more $avg aggregation operators to get averages for your other numeric fields:

db.collection.aggregate([{ 
  "$group": {
    "_id": null, 
    "avg_bvc": { "$avg": "$bvc" }, 
    "avg_dollar": { "$avg": "$dollar" } 
  } 
}])
like image 56
Sebastian Avatar answered Sep 30 '22 04:09

Sebastian


So if your data actually was numeric which is it not and your intention is to exclude the documents that have a "greater than zero" value then you include a $match statement in your aggregation pipeline in order to "filter" out these documents:

db.EvaluatedSentiments.aggregate([
    { "$match": {
        "bvc": { "$gt": 0 }
    }},
    { "$group": {
        "_id": null,
        "bvc": { "$avg": "$bvc" }
    }}
])
like image 38
Neil Lunn Avatar answered Sep 29 '22 04:09

Neil Lunn