Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why $nin is slower than $in, Mongodb

I have collection with 5M documents with correct indexes.$in working perfect, but same query with $nin super slow...What of the reason of this?

Super fast:

{'tech': {'$in': ['Wordpress', 'wordpress', 'WORDPRESS']}}

Super slow..

{'tech': {'$nin': ['Wordpress', 'wordpress', 'WORDPRESS']}}
like image 529
Максим Дихтярь Avatar asked Oct 31 '25 01:10

Максим Дихтярь


1 Answers


The following explanation is accurate only for Mongo versions prior to 3.2

Mongo v3.2 has all kinds of storage engine changes which improved performance on this issue.

Now $nin hash one important quality, which is it not a selective query, First let's understand what selectivity means:

Selectivity is the ability of a query to narrow results using the index. Effective indexes are more selective and allow MongoDB to use the index for a larger portion of the work associated with fulfilling the query.

Now they even state it themselfs:

For instance, the inequality operators $nin and $ne are not very selective since they often match a large portion of the index. As a result, in many cases, a $nin or $ne query with an index may perform no better than a $nin or $ne query that must scan all documents in a collection.

Back then selectivity was a big deal performance wise. This all leads us to your question, why isn't the index being used?

Well when Mongo is asked to create a query plan he preforms a "race" between all available query plans, one of which is a COLSCAN i.e collection scan where the first plan to find 101 documents wins. Due to the poor efficiency of non-selective query's the winning plan (And actually usually faster depending on the index and values in the query) is COLSCAN, Read further about this here

like image 100
Tom Slabbaert Avatar answered Nov 02 '25 16:11

Tom Slabbaert