I wonder the count operation in mongodb for large collection.
When I do db.collection.count()
without any criteria in a collection with ten million rows, it just take no more than 100ms.
Why so fast, is it just read from the stat value of db.collection.stats().count
?
A count()
without a query predicate is a fast/estimated count read from the collection statistics, so should return in constant time regardless of the total number of documents. This is the same value returned by db.collection.stats().count
and the $collStats
aggregation stage (MongoDB 3.6+). Note: with the WiredTiger storage engine, collection statistics are persisted periodically and may not be accurate after unclean shutdown.
A count()
with query criteria will provide a more accurate result, but will need to iterate a suitable index (or perform a collection scan if there are no candidate indexes).
You can confirm the query planning outcome by reviewing the explain()
output for a count.
The winning plan for a fast count only has a COUNT
stage:
> db.zipcodes.explain().count().queryPlanner.winningPlan
{ "stage" : "COUNT" }
The winning plan for a count using an index will have a COUNT_SCAN
input stage:
> db.zipcodes.explain().count({city:'Sydney'}).queryPlanner.winningPlan
{
"stage" : "COUNT",
"inputStage" : {
"stage" : "COUNT_SCAN",
"keyPattern" : {
"city" : 1
},
"indexName" : "city_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"city" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"indexBounds" : {
"startKey" : {
"city" : "Sydney"
},
"startKeyInclusive" : true,
"endKey" : {
"city" : "Sydney"
},
"endKeyInclusive" : true
}
}
}
The winning plan for a count requiring a collection scan will have a COLLSCAN
input stage:
> db.zipcodes.explain().count({cityx:'Sydney'}).queryPlanner.winningPlan
{
"stage" : "COUNT",
"inputStage" : {
"stage" : "COLLSCAN",
"filter" : {
"cityx" : {
"$eq" : "Sydney"
}
},
"direction" : "forward"
}
}
The operation does not perform the query but instead counts the results that would be returned by the query. You can check this by looking into query plan for it using EXPLAIN
db.collection.explain("executionStats").count()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With