Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculated group-by fields in MongoDB

For this example from the MongoDB documentation, how do I write the query using MongoTemplate?

db.sales.aggregate(
   [
      {
        $group : {
           _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } },
           totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
           averageQuantity: { $avg: "$quantity" },
           count: { $sum: 1 }
        }
      }
   ]
)

Or in general, how do I group by a calculated field?

like image 475
Navin Viswanath Avatar asked Aug 22 '14 00:08

Navin Viswanath


People also ask

Can we do group by in MongoDB?

We can group by single as well as multiple field from the collection, we can use $group operator in MongoDB to group fields from the collection and returns the new document as result. We are using $avg, $sum, $max, $min, $push, $last, $first and $addToSet operator with group by in MongoDB.

How do I sum fields in MongoDB?

If used on a field that contains both numeric and non-numeric values, $sum ignores the non-numeric values and returns the sum of the numeric values. If used on a field that does not exist in any document in the collection, $sum returns 0 for that field. If all operands are non-numeric, $sum returns 0 .

What is _ID in Group MongoDB?

The _id expression specifies the group key. If you specify an _id value of null, or any other constant value, the $group stage returns a single document that aggregates values across all of the input documents. See the Group by Null example.


1 Answers

You can actually do something like this with "project" first, but to me it's a little counter-intuitive to require a $project stage before hand:

    Aggregation agg = newAggregation(
        project("quantity")
            .andExpression("dayOfMonth(date)").as("day")
            .andExpression("month(date)").as("month")
            .andExpression("year(date)").as("year")
            .andExpression("price * quantity").as("totalAmount"),
        group(fields().and("day").and("month").and("year"))
            .avg("quantity").as("averavgeQuantity")
            .sum("totalAmount").as("totalAmount")
            .count().as("count")
    );

Like I said, counter-intuitive as you should just be able to declare all of this under $group stage, but the helpers don't seem to work this way. The serialization comes out a bit funny ( wraps the date operator arguments with arrays ) but it does seem to work. But still, this is two pipeline stages rather than one.

What is the problem with this? Well by separating the stages the stages the "project" portion forces the processing of all of the documents in the pipeline in order to get the calculated fields, that means it passes through everything before moving on to the group stage.

The difference in processing time can be clearly seen by running the queries in both forms. With a separate project stage, on my hardware takes three times longer to execute than the query where all fields are calculated during the "group" operation.

So it seems the only present way to construct this properly is by building the pipeline object yourself:

    ApplicationContext ctx =
            new AnnotationConfigApplicationContext(SpringMongoConfig.class);
    MongoOperations mongoOperation = (MongoOperations) ctx.getBean("mongoTemplate");

    BasicDBList pipeline = new BasicDBList();
    String[] multiplier = { "$price", "$quantity" };

    pipeline.add(
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("month", new BasicDBObject("$month", "$date"))
                    .append("day", new BasicDBObject("$dayOfMonth", "$date"))
                    .append("year", new BasicDBObject("$year", "$date"))
            )
            .append("totalPrice", new BasicDBObject(
                "$sum", new BasicDBObject(
                    "$multiply", multiplier
                )
            ))
            .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
            .append("count",new BasicDBObject("$sum",1))
        )
    );

    BasicDBObject aggregation = new BasicDBObject("aggregate","collection")
        .append("pipeline",pipeline);

    System.out.println(aggregation);

    CommandResult commandResult = mongoOperation.executeCommand(aggregation);

Or if all of that seems to terse to you, then you can always work with the JSON source and parse that. But of course, it has to be valid JSON:

    String json = "[" +
        "{ \"$group\": { "+
            "\"_id\": { " +
                "\"month\": { \"$month\": \"$date\" }, " +
                "\"day\": { \"$dayOfMonth\":\"$date\" }, " +
                "\"year\": { \"$year\": \"$date\" } " +
            "}, " +
            "\"totalPrice\": { \"$sum\": { \"$multiply\": [ \"$price\", \"$quantity\" ] } }, " +
            "\"averageQuantity\": { \"$avg\": \"$quantity\" }, " +
            "\"count\": { \"$sum\": 1 } " +
        "}}" +
    "]";

    BasicDBList pipeline = (BasicDBList)com.mongodb.util.JSON.parse(json);
like image 198
Neil Lunn Avatar answered Sep 29 '22 10:09

Neil Lunn