According to http://docs.mongodb.org/manual/core/indexes/#multikey-indexes, it is possible to index an array field using a multikey index, and http://docs.mongodb.org/manual/applications/aggregation/#pipeline-operators-and-indexes lists some ways an index can be used in the aggregation framework. However, there may be times when I need to perform an $unwind on an array field in order to perform a $group. My question is: can multikey indexes (or any index on such an array field) still be used once the field has been operated on in the middle of the pipeline?
An aggregation pipeline consists of one or more stages that process documents: Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values. The documents that are output from a stage are passed to the next stage.
Indexes can cover queries in an aggregation pipeline. A covered query is answered entirely from the index, without fetching the documents themselves, which makes it fast.
Mongoid exposes MongoDB's aggregation pipeline, which is used to construct flows of operations that process and return results. The aggregation pipeline is a superset of the deprecated map/reduce framework functionality.
Generally, only pipeline operators that can be flattened to a normal query ($match, $limit, $sort, and $skip) will be able to use the indexes on a collection. This is one of the reasons the $geoNear operator added in 2.4 has to be at the start of the pipeline.
Once you mutate the documents with $project, $group, or $unwind, the index is no longer valid/usable.
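To make that rule of thumb concrete, here is a minimal sketch in plain JavaScript. It only encodes the rule as stated above (leading $match, $sort, $limit, and $skip stages, up to the first stage that reshapes documents); it is not how the MongoDB optimizer decides anything, and the helper name indexEligiblePrefix is made up for illustration.

```javascript
// Hypothetical helper: lists the leading stages that, by the rule of thumb
// above, could still be served by an index on the collection.
// This is an illustration only, not MongoDB's optimizer logic.
const INDEX_FRIENDLY = new Set(['$match', '$sort', '$limit', '$skip']);

function indexEligiblePrefix(pipeline) {
  const prefix = [];
  for (const stage of pipeline) {
    const operator = Object.keys(stage)[0]; // each stage has one operator key
    if (!INDEX_FRIENDLY.has(operator)) break; // first reshaping stage ends the prefix
    prefix.push(operator);
  }
  return prefix;
}

const examplePipeline = [
  { $match: { tags: /^b/ } },
  { $unwind: '$tags' },
  { $match: { tags: /^b/ } },
];

// Only the first $match comes before the $unwind.
console.log(indexEligiblePrefix(examplePipeline));
```

The second $match is left out of the result because it comes after the $unwind, which is the point of the paragraph above.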
If you have an index on an array field, you can still use it before the $unwind to speed up the selection of documents entering the pipeline, and then further refine the selected documents with a second $match.
Consider documents like:
{ tags: [ 'cat', 'bird', 'blue' ] }
with an index on tags.
If you only wanted to group the tags starting with b, then you could perform an aggregation like:
{ pipeline: [
{ $match : { tags : /^b/ } },
{ $unwind : '$tags' },
{ $match : { tags : /^b/ } },
/* the rest */
] }
The first $match does the coarse-grained match using the index on tags.
The second $match, after the $unwind, won't be able to use the index (the document above is now 3 documents) but can evaluate each of those documents to filter out the extra documents that get created (removing { tags : 'cat' } in the example).
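If it helps to see why both matches are needed, the flow can be simulated in plain JavaScript without a server. This is only a sketch: the three sample documents, the helper names (matchTags, unwindTags, countByTag), and the choice of a per-tag count for "the rest" of the pipeline are assumptions for illustration, and the functions mimic the stages in memory rather than reproduce MongoDB's implementation.

```javascript
// In-memory stand-ins for the pipeline stages; not MongoDB's implementation.
const docs = [
  { _id: 1, tags: ['cat', 'bird', 'blue'] },
  { _id: 2, tags: ['dog', 'red'] },
  { _id: 3, tags: ['blue', 'fish'] },
];

// $match on tags: keep a document if ANY of its tags matches the regex.
// [].concat() lets the same function handle a single string after $unwind.
const matchTags = (input, regex) =>
  input.filter((doc) => [].concat(doc.tags).some((tag) => regex.test(tag)));

// $unwind: emit one document per array element.
const unwindTags = (input) =>
  input.flatMap((doc) => doc.tags.map((tag) => ({ ...doc, tags: tag })));

// One possible "rest" of the pipeline: count documents per tag.
const countByTag = (input) => {
  const counts = {};
  for (const doc of input) counts[doc.tags] = (counts[doc.tags] || 0) + 1;
  return counts;
};

const coarse = matchTags(docs, /^b/);     // whole documents with at least one b-tag
const unwound = unwindTags(coarse);       // still contains 'cat' and 'fish'
const refined = matchTags(unwound, /^b/); // only the b-tags remain
const tagCounts = countByTag(refined);

console.log(coarse.length, unwound.length, refined.length); // 2 5 3
console.log(tagCounts); // { bird: 1, blue: 2 }
```

The first filter drops document 2 entirely, which is the part an index can speed up; the second filter removes 'cat' and 'fish', which only exist as separate documents after the unwind.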
HTH - Rob.