Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mongodb - poor performance when no results return

I have Mongodb collection with about 7 million documents that represents places.

I run a query that search for places that their name start with a prefix near a specific location.

We have a compound index as described bellow to speed up the search.

When the search query find match (even if only one) the query is execute very fast (~20 milisec). But when there is no match it can take 30 sec for the query to execute.

Please assist.

In detailed:

Each place (geoData) has the following fields:

"loc" - a GeoJSON point that represent the location
"categoriesIds" - array of int ids
"name" - the name of the placee

The following index is defined on this collection:

{
  "loc" : "2dsphere",
  "categoriesIds" : 1,
  "name" : 1
}

The query is:

db.geoData.find({
  "loc":{
    "$near":{
      "$geometry":{
        "type": "Point" ,
        "coordinates": [ -0.10675191879272461 , 51.531600743186644]
      },
      "$maxDistance": 5000.0
    }
  }, 
  "categoriesIds":{
    "$in": [ 1 , 2 , 71 , 70 , 74 , 72 , 73 , 69 , 44 , 26 , 27 , 33 , 43 , 45 , 53 , 79]
  }, 
  "name":{ "$regex": "^Cafe Ne"}
})

Execution stats (Link to the whole explain result)

    "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 169,
    "totalKeysExamined" : 14333,
    "totalDocsExamined" : 1,
    "executionStages" : {
        "stage" : "GEO_NEAR_2DSPHERE",
        "nReturned" : 1,
        "executionTimeMillisEstimate" : 60,
        "works" : 14354,
        "advanced" : 1,
        "needTime" : 14351,
        "needFetch" : 0,
        "saveState" : 361,
        "restoreState" : 361,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
            "loc" : "2dsphere",
            "categoriesIds" : 1,
            "name" : 1
        },
        "indexName" : "loc_2dsphere_categoriesIds_1_name_1",
        "searchIntervals" : [ 
            {
                "minDistance" : 0,
                "maxDistance" : 3408.329295346151,
                "maxInclusive" : false
            }, 
            {
                "minDistance" : 3408.329295346151,
                "maxDistance" : 5000,
                "maxInclusive" : true
            }
        ],
        "inputStages" : [ 
            {
                "stage" : "FETCH",
                "nReturned" : 1,
                "executionTimeMillisEstimate" : 20,
                "works" : 6413,
                "advanced" : 1,
                "needTime" : 6411,
                "needFetch" : 0,
                "saveState" : 361,
                "restoreState" : 361,
                "isEOF" : 1,
                "invalidates" : 0,
                "docsExamined" : 1,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "filter" : {
                        "TwoDSphereKeyInRegionExpression" : true
                    },
                    "nReturned" : 1,
                    "executionTimeMillisEstimate" : 20,
                    "works" : 6413,
                    "advanced" : 1,
                    "needTime" : 6411,
                    "needFetch" : 0,
                    "saveState" : 361,
                    "restoreState" : 361,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "keyPattern" : {
                        "loc" : "2dsphere",
                        "categoriesIds" : 1,
                        "name" : 1
                    },
                    "indexName" : "loc_2dsphere_categoriesIds_1_name_1",
                    "isMultiKey" : true,
                    "direction" : "forward",
                    "indexBounds" : {
                        "loc" : [ 
                            "[\"2f1003230\", \"2f1003230\"]", 
                            "[\"2f10032300\", \"2f10032300\"]", 
                            "[\"2f100323000\", \"2f100323000\"]", 
                            "[\"2f1003230001\", \"2f1003230001\"]", 
                            "[\"2f10032300012\", \"2f10032300013\")", 
                            "[\"2f1003230002\", \"2f1003230002\"]", 
                            "[\"2f10032300021\", \"2f10032300022\")", 
                            "[\"2f10032300022\", \"2f10032300023\")", 
                            "[\"2f100323003\", \"2f100323003\"]", 
                            "[\"2f1003230031\", \"2f1003230031\"]", 
                            "[\"2f10032300311\", \"2f10032300312\")", 
                            "[\"2f10032300312\", \"2f10032300313\")", 
                            "[\"2f10032300313\", \"2f10032300314\")", 
                            "[\"2f1003230032\", \"2f1003230032\"]", 
                            "[\"2f10032300320\", \"2f10032300321\")", 
                            "[\"2f10032300321\", \"2f10032300322\")"
                        ],
                        "categoriesIds" : [ 
                            "[1.0, 1.0]", 
                            "[2.0, 2.0]", 
                            "[26.0, 26.0]", 
                            "[27.0, 27.0]", 
                            "[33.0, 33.0]", 
                            "[43.0, 43.0]", 
                            "[44.0, 44.0]", 
                            "[45.0, 45.0]", 
                            "[53.0, 53.0]", 
                            "[69.0, 69.0]", 
                            "[70.0, 70.0]", 
                            "[71.0, 71.0]", 
                            "[72.0, 72.0]", 
                            "[73.0, 73.0]", 
                            "[74.0, 74.0]", 
                            "[79.0, 79.0]"
                        ],
                        "name" : [ 
                            "[\"Cafe Ne\", \"Cafe Nf\")", 
                            "[/^Cafe Ne/, /^Cafe Ne/]"
                        ]
                    },
                    "keysExamined" : 6412,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0,
                    "matchTested" : 1
                }
            }, 
            {
                "stage" : "FETCH",
                "nReturned" : 0,
                "executionTimeMillisEstimate" : 40,
                "works" : 7922,
                "advanced" : 0,
                "needTime" : 7921,
                "needFetch" : 0,
                "saveState" : 261,
                "restoreState" : 261,
                "isEOF" : 1,
                "invalidates" : 0,
                "docsExamined" : 0,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "filter" : {
                        "TwoDSphereKeyInRegionExpression" : true
                    },
                    "nReturned" : 0,
                    "executionTimeMillisEstimate" : 40,
                    "works" : 7922,
                    "advanced" : 0,
                    "needTime" : 7921,
                    "needFetch" : 0,
                    "saveState" : 261,
                    "restoreState" : 261,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "keyPattern" : {
                        "loc" : "2dsphere",
                        "categoriesIds" : 1,
                        "name" : 1
                    },
                    "indexName" : "loc_2dsphere_categoriesIds_1_name_1",
                    "isMultiKey" : true,
                    "direction" : "forward",
                    "indexBounds" : {
                        "loc" : [ 
                            "[\"2f1003230\", \"2f1003230\"]", 
                            "[\"2f10032300\", \"2f10032300\"]", 
                            "[\"2f100323000\", \"2f100323000\"]", 
                            "[\"2f1003230001\", \"2f1003230001\"]", 
                            "[\"2f10032300011\", \"2f10032300012\")", 
                            "[\"2f10032300012\", \"2f10032300013\")", 
                            "[\"2f1003230002\", \"2f1003230002\"]", 
                            "[\"2f10032300021\", \"2f10032300022\")", 
                            "[\"2f10032300022\", \"2f10032300023\")", 
                            "[\"2f100323003\", \"2f100323003\"]", 
                            "[\"2f1003230031\", \"2f1003230032\")", 
                            "[\"2f1003230032\", \"2f1003230032\"]", 
                            "[\"2f10032300320\", \"2f10032300321\")", 
                            "[\"2f10032300321\", \"2f10032300322\")", 
                            "[\"2f10032300322\", \"2f10032300323\")"
                        ],
                        "categoriesIds" : [ 
                            "[1.0, 1.0]", 
                            "[2.0, 2.0]", 
                            "[26.0, 26.0]", 
                            "[27.0, 27.0]", 
                            "[33.0, 33.0]", 
                            "[43.0, 43.0]", 
                            "[44.0, 44.0]", 
                            "[45.0, 45.0]", 
                            "[53.0, 53.0]", 
                            "[69.0, 69.0]", 
                            "[70.0, 70.0]", 
                            "[71.0, 71.0]", 
                            "[72.0, 72.0]", 
                            "[73.0, 73.0]", 
                            "[74.0, 74.0]", 
                            "[79.0, 79.0]"
                        ],
                        "name" : [ 
                            "[\"Cafe Ne\", \"Cafe Nf\")", 
                            "[/^Cafe Ne/, /^Cafe Ne/]"
                        ]
                    },
                    "keysExamined" : 7921,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0,
                    "matchTested" : 0
                }
            }
        ]
    },

Execution stats when searching for "CafeNeeNNN" instead of "Cafe Ne" (Link to the whole explain result )

 "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 0,
    "executionTimeMillis" : 2537,
    "totalKeysExamined" : 232259,
    "totalDocsExamined" : 162658,
    "executionStages" : {
        "stage" : "FETCH",
        "filter" : {
            "$and" : [ 
                {
                    "name" : /^CafeNeeNNN/
                }, 
                {
                    "categoriesIds" : {
                        "$in" : [ 
                            1, 
                            2, 
                            26, 
                            27, 
                            33, 
                            43, 
                            44, 
                            45, 
                            53, 
                            69, 
                            70, 
                            71, 
                            72, 
                            73, 
                            74, 
                            79
                        ]
                    }
                }
            ]
        },
        "nReturned" : 0,
        "executionTimeMillisEstimate" : 1330,
        "works" : 302752,
        "advanced" : 0,
        "needTime" : 302750,
        "needFetch" : 0,
        "saveState" : 4731,
        "restoreState" : 4731,
        "isEOF" : 1,
        "invalidates" : 0,
        "docsExamined" : 70486,
        "alreadyHasObj" : 70486,
        "inputStage" : {
            "stage" : "GEO_NEAR_2DSPHERE",
            "nReturned" : 70486,
            "executionTimeMillisEstimate" : 1290,
            "works" : 302751,
            "advanced" : 70486,
            "needTime" : 232264,
            "needFetch" : 0,
            "saveState" : 4731,
            "restoreState" : 4731,
            "isEOF" : 1,
            "invalidates" : 0,
            "keyPattern" : {
                "loc" : "2dsphere"
            },
            "indexName" : "loc_2dsphere",
            "searchIntervals" : [ 
                {
                    "minDistance" : 0,
                    "maxDistance" : 3408.329295346151,
                    "maxInclusive" : false
                }, 
                {
                    "minDistance" : 3408.329295346151,
                    "maxDistance" : 5000,
                    "maxInclusive" : true
                }
            ],
            "inputStages" : [ 
                {
                    "stage" : "FETCH",
                    "nReturned" : 44540,
                    "executionTimeMillisEstimate" : 110,
                    "works" : 102690,
                    "advanced" : 44540,
                    "needTime" : 58149,
                    "needFetch" : 0,
                    "saveState" : 4731,
                    "restoreState" : 4731,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "docsExamined" : 44540,
                    "alreadyHasObj" : 0,
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "filter" : {
                            "TwoDSphereKeyInRegionExpression" : true
                        },
                        "nReturned" : 44540,
                        "executionTimeMillisEstimate" : 90,
                        "works" : 102690,
                        "advanced" : 44540,
                        "needTime" : 58149,
                        "needFetch" : 0,
                        "saveState" : 4731,
                        "restoreState" : 4731,
                        "isEOF" : 1,
                        "invalidates" : 0,
                        "keyPattern" : {
                            "loc" : "2dsphere"
                        },
                        "indexName" : "loc_2dsphere",
                        "isMultiKey" : false,
                        "direction" : "forward",
                        "indexBounds" : {
                            "loc" : [ 
                                "[\"2f1003230\", \"2f1003230\"]", 
                                "[\"2f10032300\", \"2f10032300\"]", 
                                "[\"2f100323000\", \"2f100323000\"]", 
                                "[\"2f1003230001\", \"2f1003230001\"]", 
                                "[\"2f10032300012\", \"2f10032300013\")", 
                                "[\"2f1003230002\", \"2f1003230002\"]", 
                                "[\"2f10032300021\", \"2f10032300022\")", 
                                "[\"2f10032300022\", \"2f10032300023\")", 
                                "[\"2f100323003\", \"2f100323003\"]", 
                                "[\"2f1003230031\", \"2f1003230031\"]", 
                                "[\"2f10032300311\", \"2f10032300312\")", 
                                "[\"2f10032300312\", \"2f10032300313\")", 
                                "[\"2f10032300313\", \"2f10032300314\")", 
                                "[\"2f1003230032\", \"2f1003230032\"]", 
                                "[\"2f10032300320\", \"2f10032300321\")", 
                                "[\"2f10032300321\", \"2f10032300322\")"
                            ]
                        },
                        "keysExamined" : 102689,
                        "dupsTested" : 0,
                        "dupsDropped" : 0,
                        "seenInvalidated" : 0,
                        "matchTested" : 44540
                    }
                }, 
                {
                    "stage" : "FETCH",
                    "nReturned" : 47632,
                    "executionTimeMillisEstimate" : 250,
                    "works" : 129571,
                    "advanced" : 47632,
                    "needTime" : 81938,
                    "needFetch" : 0,
                    "saveState" : 2556,
                    "restoreState" : 2556,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "docsExamined" : 47632,
                    "alreadyHasObj" : 0,
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "filter" : {
                            "TwoDSphereKeyInRegionExpression" : true
                        },
                        "nReturned" : 47632,
                        "executionTimeMillisEstimate" : 230,
                        "works" : 129571,
                        "advanced" : 47632,
                        "needTime" : 81938,
                        "needFetch" : 0,
                        "saveState" : 2556,
                        "restoreState" : 2556,
                        "isEOF" : 1,
                        "invalidates" : 0,
                        "keyPattern" : {
                            "loc" : "2dsphere"
                        },
                        "indexName" : "loc_2dsphere",
                        "isMultiKey" : false,
                        "direction" : "forward",
                        "indexBounds" : {
                            "loc" : [ 
                                "[\"2f1003230\", \"2f1003230\"]", 
                                "[\"2f10032300\", \"2f10032300\"]", 
                                "[\"2f100323000\", \"2f100323000\"]", 
                                "[\"2f1003230001\", \"2f1003230001\"]", 
                                "[\"2f10032300011\", \"2f10032300012\")", 
                                "[\"2f10032300012\", \"2f10032300013\")", 
                                "[\"2f1003230002\", \"2f1003230002\"]", 
                                "[\"2f10032300021\", \"2f10032300022\")", 
                                "[\"2f10032300022\", \"2f10032300023\")", 
                                "[\"2f100323003\", \"2f100323003\"]", 
                                "[\"2f1003230031\", \"2f1003230032\")", 
                                "[\"2f1003230032\", \"2f1003230032\"]", 
                                "[\"2f10032300320\", \"2f10032300321\")", 
                                "[\"2f10032300321\", \"2f10032300322\")", 
                                "[\"2f10032300322\", \"2f10032300323\")"
                            ]
                        },
                        "keysExamined" : 129570,
                        "dupsTested" : 0,
                        "dupsDropped" : 0,
                        "seenInvalidated" : 0,
                        "matchTested" : 47632
                    }
                }
            ]
        }
    },

Indexes on the collection

{
"0" : {
    "v" : 1,
    "key" : {
        "_id" : 1
    },
    "name" : "_id_",
    "ns" : "wego.geoData"
},
"1" : {
    "v" : 1,
    "key" : {
        "srcId" : 1
    },
    "name" : "srcId_1",
    "ns" : "wego.geoData"
},
"2" : {
    "v" : 1,
    "key" : {
        "loc" : "2dsphere"
    },
    "name" : "loc_2dsphere",
    "ns" : "wego.geoData",
    "2dsphereIndexVersion" : 2
},
"3" : {
    "v" : 1,
    "key" : {
        "name" : 1
    },
    "name" : "name_1",
    "ns" : "wego.geoData"
},
"4" : {
    "v" : 1,
    "key" : {
        "loc" : "2dsphere",
        "categoriesIds" : 1,
        "name" : 1
    },
    "name" : "loc_2dsphere_categoriesIds_1_name_1",
    "ns" : "wego.geoData",
    "2dsphereIndexVersion" : 2
},
"5" : {
    "v" : 1,
    "key" : {
        "loc" : "2dsphere",
        "categoriesIds" : 1,
        "keywords" : 1
    },
    "name" : "loc_2dsphere_categoriesIds_1_keywords_1",
    "ns" : "wego.geoData",
    "2dsphereIndexVersion" : 2
}
}

Collection stats link

like image 947
Eliezer Avatar asked Nov 23 '15 11:11

Eliezer


People also ask

What method can we use to maximize performance and prevent MongoDB from returning more results than required for processing?

You must apply limit() to the cursor before retrieving any documents from the database. Use limit() to maximize performance and prevent MongoDB from returning more results than required for processing.

What makes MongoDB slow?

The slow queries can happen when you do not have proper DB indexes. Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement.

Does indexing improve query performance MongoDB?

Indexes also improve efficiency on queries that routinely sort on a given field. Because MongoDB can read indexes in both ascending and descending order, the direction of a single-key index does not matter. Indexes support queries, update operations, and some phases of the aggregation pipeline.


3 Answers

I am going to speculate here a bit, and then a comment about your design.

First, when you create an index on key which has an array on a value you create a record for each element of the array:

To index a field that holds an array value, MongoDB creates an index key for each element in the array.

This is from MongoDB own documentation about indecies.

So, if your typical record more than a hand full of categories and you have 7 million records, your index is huge, and it will also take time to scan the index itself to find that the index does not contain what you are looking for. It is still faster than a collection scan, but it darn slow compared to how fast it takes to find an existing record.

Now, let me comment about your schema design. This is a matter of style so feel free to ignore this part.

You have a record which might be in 17 categories. That is a bit overwhelming, and over abusing the term category. A category is a specific division, a way to quickly associate a thing with a group of things. What is a thing that belong to so many groups? Let's take for example your records Cafe Ne. I assume in the real world - and please remember, programming and applications are at best when the solve real world problems - Cafe Ne, is either a restaurant, a caffe, a jazz bar, a dinner. It's for sure not a garage (unless, cafe means cars in a language I don't know). I can hardly imagine it's a bank or a dental clinic. I'd have to really make an effort, to find more than 10 meaningful categories, that users search a cafe by.

My point is, even though mongodb allows you to design things like that, it does not mean you have to. Try to narrow the number of categories you have and the ones you look for, and you will have much better performance.

like image 132
oz123 Avatar answered Oct 13 '22 18:10

oz123


As JohnnyHK suggested in comments, and Oz123 pointed to in his answer, the issue here appears to be an index that has grown so large that it fails to perform well as an index. I believe that in addition to the category expansion issue that has already been pointed out, the ordering of fields in your index creates trouble. Compound indexes are built according to the order of fields, and putting name after categoriesIds makes it more costly to query on name.

It's clear that you need to tune your indexes. Exactly how you tune them depends on the types of queries that you are expecting to support. In particularly, I'm not sure if you'll see better performance from a compound index of loc and name or if you'll see better performance from individual indexes, one for loc and one for name. Mongo themselves are a little vague on when it's best to use a compound index and when it's best to use individual indexes and rely on index intersection.

My intuition says that individual indexes will perform better, but I'd test both scenarios.

If you anticipate needing to query by category as well, without name or loc fields that could narrow the query down, it's probably best to create a separate categoriesIds index.

like image 29
Laizer Avatar answered Oct 13 '22 18:10

Laizer


The order of the fields in a compound index is very important. It's hard to diagnose without having access to the real data and usage patterns, but this key might increase the odds of matching (or not) the document using only the index:

{
  "loc" : "2dsphere",
  "name" : 1,
  "categoriesIds" : 1
}
like image 26
jbmartinez Avatar answered Oct 13 '22 20:10

jbmartinez