Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of count for mongodb

Tags:

mongodb

I wonder the count operation in mongodb for large collection.

When I do db.collection.count() without any criteria in a collection with ten million rows, it just take no more than 100ms.

Why so fast, is it just read from the stat value of db.collection.stats().count?

like image 545
hongshuwei Avatar asked Apr 23 '18 14:04

hongshuwei


Video Answer


2 Answers

A count() without a query predicate is a fast/estimated count read from the collection statistics, so should return in constant time regardless of the total number of documents. This is the same value returned by db.collection.stats().count and the $collStats aggregation stage (MongoDB 3.6+). Note: with the WiredTiger storage engine, collection statistics are persisted periodically and may not be accurate after unclean shutdown.

A count() with query criteria will provide a more accurate result, but will need to iterate a suitable index (or perform a collection scan if there are no candidate indexes).

You can confirm the query planning outcome by reviewing the explain() output for a count.

The winning plan for a fast count only has a COUNT stage:

> db.zipcodes.explain().count().queryPlanner.winningPlan
{ "stage" : "COUNT" }

The winning plan for a count using an index will have a COUNT_SCAN input stage:

> db.zipcodes.explain().count({city:'Sydney'}).queryPlanner.winningPlan
{
    "stage" : "COUNT",
    "inputStage" : {
        "stage" : "COUNT_SCAN",
        "keyPattern" : {
            "city" : 1
        },
        "indexName" : "city_1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
            "city" : [ ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "indexBounds" : {
            "startKey" : {
                "city" : "Sydney"
            },
            "startKeyInclusive" : true,
            "endKey" : {
                "city" : "Sydney"
            },
            "endKeyInclusive" : true
        }
    }
}

The winning plan for a count requiring a collection scan will have a COLLSCAN input stage:

> db.zipcodes.explain().count({cityx:'Sydney'}).queryPlanner.winningPlan
{
    "stage" : "COUNT",
    "inputStage" : {
        "stage" : "COLLSCAN",
        "filter" : {
            "cityx" : {
                "$eq" : "Sydney"
            }
        },
        "direction" : "forward"
    }
}
like image 136
Stennie Avatar answered Oct 17 '22 10:10

Stennie


The operation does not perform the query but instead counts the results that would be returned by the query. You can check this by looking into query plan for it using EXPLAIN

db.collection.explain("executionStats").count()

like image 44
Rajesh Mehta Avatar answered Oct 17 '22 08:10

Rajesh Mehta