Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Index optimization for mongodb aggregation framework

Tags:

I have a match-unwind-group-sort aggregation pipeline in mongo 2.4.4 and I need to speed up the aggregation.

The match operation consists of range queries on 16 fields. I've used the .explain() method to optimize range queries (i.e. create compound indexes). Is there a similar function for optimizing the aggregation? I'm looking for something like:

db.col.aggregate([]).explain() 

Also, am I right to focus on index optimization?

like image 886
Tom Swifty Avatar asked Oct 25 '13 13:10

Tom Swifty


People also ask

Does MongoDB aggregate use index?

$group does not use index data. From the mongoDB docs: The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.

How can you optimize an aggregation pipeline to perform faster?

Pipeline Sequence Optimization For an aggregation pipeline that contains a projection stage ( $project or $unset or $addFields or $set ) followed by a $match stage, MongoDB moves any filters in the $match stage that do not require values computed in the projection stage to a new $match stage before the projection.

Which index is faster in MongoDB?

Create Indexes to Support Queries If a query searches multiple fields, create a compound index. Scanning an index is much faster than scanning a collection. The indexes structures are smaller than the documents reference, and store references in order.


1 Answers

For the first question, yes, you can explain aggregates.

db.collection.runCommand("aggregate", {pipeline: YOUR_PIPELINE, explain: true}) 

For the second one, the indexes you create to optimize the range queries will also apply to the $match stage of the aggregation pipeline, if they occur at the beginning of the pipeline. So you are right to focus on index optimizations.

See Pipeline Operators and Indexes.

Update 2

More about aggregate and explain: on version 2.4 it is unreliable; on 2.6+ it does not provide query execution data. https://groups.google.com/forum/#!topic/mongodb-user/2LzAkyaNqe0

Update 1

Transcript of an aggregation explain on MongoDB 2.4.5.

$ mongo so MongoDB shell version: 2.4.5 connecting to: so > db.q19329239.runCommand("aggregate", {pipeline: [{$group: {_id: '$user.id', hits: {$sum: 1}}}, {$match: {hits: {$gt: 10}}}], explain: true}) {     "serverPipeline" : [         {             "query" : {              },             "projection" : {                 "user.id" : 1,                 "_id" : 0             },             "cursor" : {                 "cursor" : "BasicCursor",                 "isMultiKey" : false,                 "n" : 1031,                 "nscannedObjects" : 1031,                 "nscanned" : 1031,                 "nscannedObjectsAllPlans" : 1031,                 "nscannedAllPlans" : 1031,                 "scanAndOrder" : false,                 "indexOnly" : false,                 "nYields" : 0,                 "nChunkSkips" : 0,                 "millis" : 0,                 "indexBounds" : {                  },                 "allPlans" : [                     {                         "cursor" : "BasicCursor",                         "n" : 1031,                         "nscannedObjects" : 1031,                         "nscanned" : 1031,                         "indexBounds" : {                          }                     }                 ],                 "server" : "ficrm-rafa.local:27017"             }         },         {             "$group" : {                 "_id" : "$user.id",                 "hits" : {                     "$sum" : {                         "$const" : 1                     }                 }             }         },         {             "$match" : {                 "hits" : {                     "$gt" : 10                 }             }         }     ],     "ok" : 1 } 

Server version.

$ mongo so MongoDB shell version: 2.4.5 connecting to: so > db.version() 2.4.5 
like image 71
Rafa Avatar answered Sep 20 '22 10:09

Rafa