I have a collection with compound index on four fields in order: (A,B,C,D)
When I do query like
find({A: val1, B: val2, C: val3}).sort({D: 1}).limit(N)
with strict equals in fields A,B,C it runs very fast, as it should be. And explain()
tells me that just N documents was scanned.
If I change one of equals to $in
operator (with about 100 elements in array) it scans much more number of documents and runs more slowly:
find({A: {$in: [val0, val1, ...]}, B: val2, C: val3}).sort({D: 1}).limit(N)
Other operators like $or
have the same effect.
Logically one $in
with 100 elements must be very similar to 100 individual queries with strict equals. Second variant runs much more faster in the database but requires getting all the elements (without limit) with post-sorting and limiting on the clientside.
Does it make sense to split this one query with $in
into several queries with equals to make cursor scan less number of documents? What will be more efficient in case of millions of documents in the collection?
MongoDB uses indexing in order to make the query processing more efficient. If there is no indexing, then the MongoDB must scan every document in the collection and retrieve only those documents that match the query.
MongoDB supports compound indexes, where a single index structure holds references to multiple fields [1] within a collection's documents. The following diagram illustrates an example of a compound index on two fields: MongoDB imposes a limit of 32 fields for any compound index.
Or in other words, compound indexes are those indexes where a single index field contains references to multiple fields. In MongoDB, the compound index can contain a single hashed index field, if a field contains more than one hashed index field then MongoDB will give an error.
MongoDB can use the intersection of multiple indexes to fulfill queries. In general, each index intersection involves two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query.
Have you tested with the index {B:1,C:1,A:1,D:1} ? That way the exact B and C values can be processed quickly, a range can be used on the A field and sorting by D can still be done via the index.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With