I'm stuck with this simple prefix query. Although Mongo docs state that you can get pretty good performance by using the prefix regex format (/^a/), the query is pretty slow when I try to sort the results:
940 millis
db.posts.find({hashtags: /^noticias/ }).limit(15).sort({rank : -1}).hint('hashtags_1_rank_-1').explain()
{
"cursor" : "BtreeCursor hashtags_1_rank_-1 multi",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 142691,
"nscanned" : 142692,
"nscannedObjectsAllPlans" : 142691,
"nscannedAllPlans" : 142692,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 934,
"indexBounds" : {
"hashtags" : [
[
"noticias",
"noticiat"
],
[
/^noticias/,
/^noticias/
]
],
"rank" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "XRTZ048.local:27017"
}
However, the unsorted version of the same query is super fast:
0 millis
db.posts.find({hashtags: /^noticias/ }).limit(15).hint('hashtags_1_rank_-1').explain()
{
"cursor" : "BtreeCursor hashtags_1_rank_-1 multi",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 15,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 15,
"nscannedAllPlans" : 15,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"hashtags" : [
[
"noticias",
"noticiat"
],
[
/^noticias/,
/^noticias/
]
],
"rank" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "XRTZ048.local:27017"
}
The query is also fast if I remove the regex and sort:
0 millis
db.posts.find({hashtags: 'noticias' }).limit(15).sort({rank : -1}).hint('hashtags_1_rank_-1').explain()
{
"cursor" : "BtreeCursor hashtags_1_rank_-1",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 15,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 15,
"nscannedAllPlans" : 15,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"hashtags" : [
[
"noticias",
"noticias"
]
],
"rank" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "XRTZ048.local:27017"
}
It seems like using both regex and sort makes Mongo scan lots of records. However, sort is scanning just 15 if I don't use the regex. What's wrong here?
The scanAndOrder: true
in the explain output indicates that the query is having to retrieve the documents and then sort them in memory before the output is returned. This is an expensive operation, and will be having an impact on the performance of your query.
The existence of scanAndOrder: true
as well as the difference in nscanned
an n
in the explain output indicates that the query is not using an optimal index. In this case it appears to be needing to do a collection scan. You might be able to alleviate this issue by including the index keys in your sort
criteria. From my testing:
db.posts.find({hashtags: /^noticias/ }).limit(15).sort({hashtags:1, rank : -1}).explain()
Does not require a scan and order, and returns n
and nscanned
of the number of records you are looking for. This would also mean sorting on the hashtags
key, which may or may not be useful to you, but should increase the performance of the query.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With