MongoDB $lookup not using index

Tags: mongodb, mongodb-query, aggregation-framework, mongodb-shell

I'm writing a query that requires a $lookup between two collections. As I understand it, the foreignField must be indexed for the join to run in a timely fashion. However, even after adding an index on that field, the query still falls back to COLLSCAN.

db.users.aggregate([
  {$lookup:{ from: "transactions", localField: '_id', foreignField: 'uid', as: 'transaction' }},
  { $match: { transaction: { "$size" : 0} } },
  { $count: "total"},
], { explain: true })

This returns:

"queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "test.users",
    "indexFilterSet" : false,
    "parsedQuery" : {

    },
    "winningPlan" : {
        "stage" : "COLLSCAN",
        "direction" : "forward"
    },
    "rejectedPlans" : [ ]
}

As I mentioned, I do have the uid field indexed in the transactions collection:

> db.transactions.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.transactions"
    },
    {
        "v" : 1,
        "key" : {
            "uid" : 1
        },
        "name" : "uid_1",
        "ns" : "test.transactions"
    }
]

The query takes a few minutes to run in a DB of approximately 7M documents. I'm using MongoDB v3.4.7. Any idea as to what I could be doing wrong? Thanks in advance!

asked Aug 29 '17 12:08

ashe540

2 Answers

The "stage" : "COLLSCAN" isn't referring to the $lookup at all.

The first step in that aggregation pipeline is to fetch all of the documents from the 'users' collection. Since no filter is provided for that step, a collection scan is the most efficient method.

The $lookup stage should be planned like any other query, and would likely use the index.
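One way to check this is to explain the lookup's query on its own. The sketch below assumes that, for each user, $lookup issues roughly an equality query on uid against transactions. planStages is a hypothetical helper, not part of MongoDB, and it only follows a single inputStage chain.

```javascript
// Hypothetical helper (not a MongoDB API): list the stage names in an
// explain() winning plan, outermost stage first. It follows only the
// single "inputStage" chain, which is enough for simple plans.
function planStages(plan) {
  const stages = [];
  for (let node = plan; node; node = node.inputStage) {
    stages.push(node.stage);
  }
  return stages;
}

// The winning plan from the question. It describes the scan of test.users,
// not the $lookup into transactions.
const usersPlan = { stage: "COLLSCAN", direction: "forward" };
const usersStages = planStages(usersPlan); // ["COLLSCAN"]

// Assumption: the per-user lookup behaves like this equality query.
// In the mongo shell, explain it directly to see whether uid_1 is picked:
//   const u = db.users.findOne();
//   const plan = db.transactions.find({ uid: u._id }).explain().queryPlanner.winningPlan;
//   planStages(plan);
```

If the stages listed for that find include IXSCAN, the uid_1 index is usable for the join, even though the explain output for users shows COLLSCAN.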

answered Oct 19 '22 04:10

Joe

The plan is COLLSCAN because the first stage of your aggregation pipeline is not a $match, $sort, or $geoNear on an indexed key, and your later $match stage does not query an indexed key either.

Case 1: If you $match on an indexed key in the first stage, the winningPlan stage will be "FETCH" and its inputStage will be "IXSCAN":

"winningPlan" : {
    "stage" : "FETCH",
    "inputStage" : {
            "stage" : "IXSCAN",
        ...
    }
}

Case 2: If you $match on a non-indexed key in the first stage, the winningPlan stage will be "COLLSCAN":

"winningPlan" : {
    "stage" : "COLLSCAN"
}

Case 3: If you $match on an indexed key after the $lookup (the position of the $match in your query), the winningPlan stage will be "FETCH" and its inputStage will be "IXSCAN".

Case 4: If you $match on a non-indexed key after the $lookup (as you did), the winningPlan stage will be "COLLSCAN".

With 7M records, your queries should use an index. Don't over-index, though: indexes take up RAM, and $ne or $nin conditions on an indexed key can't use the index efficiently.
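As a rough illustration of Case 1, the pipeline from the question could start with a $match on an indexed key. The createdAt field and its index are assumptions made up for this sketch; neither appears in the question. Adding the filter also changes the result, since it counts only the matching users.

```javascript
// Assumption: users has a "createdAt" field with an index on it, e.g.
//   db.users.createIndex({ createdAt: 1 })
// Both the field and the index are hypothetical.
const pipeline = [
  // A leading $match on an indexed key gives the planner a filter it can
  // answer with an index scan on users instead of a full collection scan.
  { $match: { createdAt: { $gte: new Date("2017-01-01T00:00:00Z") } } },
  // Same join as in the question.
  { $lookup: { from: "transactions", localField: "_id", foreignField: "uid", as: "transaction" } },
  // Keep only users with no transactions, then count them.
  { $match: { transaction: { $size: 0 } } },
  { $count: "total" },
];

// In the mongo shell:
//   db.users.aggregate(pipeline, { explain: true })
```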

MongoDB Docs: Optimizing Aggregation Pipeline

MongoDB Docs: Indexing Strategies

answered Oct 19 '22 03:10

Ashutosh

