I created a database with a single collection that stores documents with only 2 fields (and an id):
public class Hamster
{
    public ObjectId Id;
    public string Name;
    public int Age;
}
I also created an index for each field.
When I execute a query filtering on both fields I expect it to combine both indexes using Index Intersection to reduce the collection scanning and improve performance. This is never the case. I haven't yet managed to induce an index intersection.
So, what stops MongoDB from applying index intersection?
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement.
MongoDB provides a createIndex() method for creating an index. The key names the field to index, and the value 1 (or -1) sets the order in which the index entries are arranged (ascending or descending).
MongoDB can use the intersection of multiple indexes to fulfill queries. In general, each index intersection involves two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query.
To verify whether MongoDB used index intersection, run explain(); the results of explain() will include either an AND_SORTED stage or an AND_HASH stage.
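As a rough illustration of that check, here is a minimal sketch that walks an explain() result and looks for an AND_SORTED or AND_HASH stage. It assumes the newer output shape, where the plan sits under queryPlanner.winningPlan and children are nested under inputStage or inputStages. The helper names (collectStages, usesIndexIntersection) and both sample documents are hypothetical and hand-written, not real server output.

```javascript
// Collect every stage name in an explain() plan tree.
// Assumption: children are nested under `inputStage` (one child)
// or `inputStages` (several children).
function collectStages(plan, found = []) {
  if (!plan || typeof plan !== "object") return found;
  if (typeof plan.stage === "string") found.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, found);
  for (const child of plan.inputStages || []) collectStages(child, found);
  return found;
}

// True when the winning plan contains an index intersection stage.
function usesIndexIntersection(explainOutput) {
  const planner = explainOutput && explainOutput.queryPlanner;
  const winningPlan = planner && planner.winningPlan;
  return collectStages(winningPlan).some(
    (stage) => stage === "AND_SORTED" || stage === "AND_HASH"
  );
}

// Hand-written sample documents (hypothetical, for illustration only).
const intersectionPlan = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: {
        stage: "AND_SORTED",
        inputStages: [
          { stage: "IXSCAN", indexName: "Name_1" },
          { stage: "IXSCAN", indexName: "Age_1" },
        ],
      },
    },
  },
};
const singleIndexPlan = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "Age_1" },
    },
  },
};

console.log(usesIndexIntersection(intersectionPlan)); // true
console.log(usesIndexIntersection(singleIndexPlan)); // false
```

In the mongo shell, the object passed in would be whatever explain() returns for the query on Name and Age.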
When you use explain(true) you can see that the optimizer considers using index intersection and chooses not to:
"cursor" : "BtreeCursor Age", // Chosen plan.
...
"allPlans" : [
    {
        "cursor" : "BtreeCursor Age",
        ...
    },
    {
        "cursor" : "BtreeCursor Name",
        ...
    },
    {
        "cursor" : "Complex Plan", // Index intersection.
        ...
    }
]
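Reading that output by eye gets tedious for larger results, so the same check can be sketched in code. This assumes the legacy explain(true) shape shown above, where an intersection plan reports its cursor as "Complex Plan". The function names and the sample object are hypothetical; the sample only mirrors the fields visible in the excerpt.

```javascript
// True when any candidate plan in the legacy output is an intersection plan.
function consideredIntersection(legacyExplain) {
  return (legacyExplain.allPlans || []).some(
    (plan) => plan.cursor === "Complex Plan"
  );
}

// True when the chosen (top-level) plan is itself an intersection plan.
function choseIntersection(legacyExplain) {
  return legacyExplain.cursor === "Complex Plan";
}

// Hand-written sample mirroring the excerpt (elided fields left out).
const sampleExplain = {
  cursor: "BtreeCursor Age",
  allPlans: [
    { cursor: "BtreeCursor Age" },
    { cursor: "BtreeCursor Name" },
    { cursor: "Complex Plan" },
  ],
};

console.log(consideredIntersection(sampleExplain)); // true
console.log(choseIntersection(sampleExplain)); // false
```

For the sample, intersection was considered but not chosen, which is the situation described in the question.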
MongoDB will never choose intersection if there's a sufficient compound index. Other limitations can be found on the Jira ticket for Index Intersection:
The query optimizer may select index intersection plans when the following conditions hold:
1. Most of the documents in the relevant collection are disk-resident. The advantage of index intersection is that it can avoid fetching complete documents when the size of the intersection is small. If the documents are already in memory, there is nothing to gain by avoiding fetches.
2. The query predicates are single point intervals, rather than range predicates or a set of intervals. Queries over single point intervals return documents sorted by disk location, which allows the optimizer to select plans that compute the intersection in a non-blocking fashion. This is generally faster than the alternative mode of computing the intersection, which is to build a hash table with the results from one index, and then probe it with the results from the second index.
3. Neither of the indices to be intersected are highly selective. If one of the indices is selective then the optimizer will choose a plan which simply scans this selective index.
4. The size of the intersection is small relative to the number of index keys scanned by either single-index solution. In this case the query executor can look at a smaller set of documents using index intersection, potentially allowing us to reap the benefits of fewer fetches from disk.
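The two ways of computing the intersection mentioned in condition 2 can be sketched as follows. This is only a toy model over arrays of record ids, not MongoDB's implementation: intersectSorted assumes both inputs arrive sorted by disk location and merges them in a single pass, while intersectHash builds a hash table from one input and probes it with the other. The function names and the id values are made up for illustration.

```javascript
// Non-blocking mode: both inputs are already sorted by record id,
// so one forward pass over each list finds the common ids.
function intersectSorted(a, b) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return out;
}

// Hash mode: build a set from the first input, then probe it
// with each id from the second input. Input order does not matter,
// but the whole first input must be read before any result is produced.
function intersectHash(a, b) {
  const seen = new Set(a);
  return b.filter((id) => seen.has(id));
}

// Hypothetical record ids matched by each single-field index.
const nameMatches = [3, 8, 15, 21, 42];
const ageMatches = [1, 8, 9, 21, 30, 42, 57];

console.log(intersectSorted(nameMatches, ageMatches)); // ids 8, 21, 42
console.log(intersectHash(nameMatches, ageMatches)); // ids 8, 21, 42
```

Only the three ids in the result would need a full document fetch, which is where the savings described in conditions 1 and 4 would come from.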
MongoDB has many limitations on index intersection, which make it less likely to actually be used.