Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB $lookup not using index

I'm writing a query that requires a $lookup between two tables and as I understand it, it's essential that the foreignField have an index in order to perform this join in a timely fashion. However, even after adding an index on the field, the query is still falling back to COLLSCAN.

db.users.aggregate([
  {$lookup:{ from: "transactions", localField: '_id', foreignField: 'uid', as: 'transaction' }},
  { $match: { transaction: { "$size" : 0} } },
  { $count: "total"},
], { explain: true })

This returns:

"queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "test.users",
    "indexFilterSet" : false,
    "parsedQuery" : {

    },
    "winningPlan" : {
        "stage" : "COLLSCAN",
        "direction" : "forward"
    },
    "rejectedPlans" : [ ]
}

As I mentioned, I do have the uid field indexed in the transactions collection:

> db.transactions.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.transactions"
    },
    {
        "v" : 1,
        "key" : {
            "uid" : 1
        },
        "name" : "uid_1",
        "ns" : "test.transactions"
    }
]

The query takes a few minutes to run in a DB of approximately 7M documents. I'm using MongoDB v3.4.7. Any idea as to what I could be doing wrong? Thanks in advance!

like image 996
ashe540 Avatar asked Aug 29 '17 12:08

ashe540


People also ask

Why index is not used in MongoDB?

Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement.

Does MongoDB $in use index?

Generally, MongoDB only uses one index to fulfill most queries. However, each clause of an $or query may use a different index, and in addition, MongoDB can use an intersection of multiple indexes.

Does MongoDB aggregate use index?

$group does not use index data. From the mongoDB docs: The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.

What is $lookup in MongoDB?

$lookup performs an equality match on the localField to the foreignField from the documents of the from collection. If an input document does not contain the localField , the $lookup treats the field as having a value of null for matching purposes.


2 Answers

The "stage" : "COLLSCAN", isn't referring to the $lookup at all.

The first step in that aggregation pipeline is to fetch all of the documents from the 'users' collection. Since there are no filters at all provided for that, collection scan is the most efficient method.

The $lookup stage should be planned like any other query, and would likely use the index.

like image 85
Joe Avatar answered Oct 19 '22 04:10

Joe


Because your aggregation pipeline first stage has no $match or $sort or $geoNear query on indexed key and In $match stage you didn't query on any index key.

Case 1: If you do $match on indexed key in first stage, WinningPlan stage will be "FETCH" and stage of inputStage will be "IXSCAN"

"winningPlan" : {
    "stage" : "FETCH",
    "inputStage" : {
            "stage" : "IXSCAN",
        ...
    }
}

Case 2: If you do $match on non-indexed key in first stage, WinningPlan stage will be "COLLSCAN"

"winningPlan" : {
    "stage" : "COLLSCAN"
}

Case 3: If You do $match on index key after lookup (according to your query), WinningPlan stage will be "FETCH" and inputStage will be "IXSCAN".

Case 4: If You do $match on non-index key after lookup (just you did), WinningPlan stage will be "COLLSCAN".

For 7M record, you must use index in your queries. Don't do too much indexing, because they will stored in the RAM and you can't use $ne or $nin on indexed key properly.

Mongodb Docs: Optimizing Aggregation Pipeline

Mongodb Docs: Indexing Strategies

like image 26
Ashutosh Avatar answered Oct 19 '22 03:10

Ashutosh