
C# MongoDB driver [2.7.0] CountDocumentsAsync unexpected native query


I encountered something odd when using the CountDocumentsAsync function of the C# MongoDB driver. I enabled query logging on MongoDB and this is what I got:

{
    "op" : "command",
    "ns" : "somenamespace",
    "command" : {
        "aggregate" : "reservations",
        "pipeline" : [
            {
                "some_query_key": "query_value"
            },
            {
                "$group" : {
                    "_id" : null,
                    "n" : {
                        "$sum" : 1
                    }
                }
            }
        ],
        "cursor" : {}
    },
    "keyUpdates" : 0,
    "writeConflicts" : 0,
    "numYield" : 9,
    "locks" : {
        "Global" : {
            "acquireCount" : {
                "r" : NumberLong(24)
            }
        },
        "Database" : {
            "acquireCount" : {
                "r" : NumberLong(12)
            }
        },
        "Collection" : {
            "acquireCount" : {
                "r" : NumberLong(12)
            }
        }
    },
    "responseLength" : 138,
    "protocol" : "op_query",
    "millis" : 2,
    "execStats" : {},
    "ts" : ISODate("2018-09-27T14:08:48.099Z"),
    "client" : "172.17.0.1",
    "allUsers" : [ ],
    "user" : ""
}

A simple count is converted into an aggregate.

More interestingly, when I use the CountAsync function (which, by the way, is marked obsolete with a remark that I should be using CountDocumentsAsync), it produces:

{
    "op" : "command",
    "ns" : "somenamespace",
    "command" : {
        "count" : "reservations",
        "query" : {
            "query_key": "query_value"
        }
    },
    "keyUpdates" : 0,
    "writeConflicts" : 0,
    "numYield" : 9,
    "locks" : {
        "Global" : {
            "acquireCount" : {
                "r" : NumberLong(20)
            }
        },
        "Database" : {
            "acquireCount" : {
                "r" : NumberLong(10)
            }
        },
        "Collection" : {
            "acquireCount" : {
                "r" : NumberLong(10)
            }
        }
    },
    "responseLength" : 62,
    "protocol" : "op_query",
    "millis" : 2,
    "execStats" : {

    },
    "ts" : ISODate("2018-09-27T13:58:27.758Z"),
    "client" : "172.17.0.1",
    "allUsers" : [ ],
    "user" : ""
}

which is what I would expect. Does anyone know the reason for this behavior? I browsed the documentation but didn't find anything relevant.

djaszczurowski asked Sep 27 '18 14:09



1 Answer

This is the documented behaviour for drivers supporting 4.0 features. The reason for the change is to remove confusion and make it clear when an estimate is used and when it is not.

When counting based on a query filter (rather than just counting the entire collection) both methods will cause the server to iterate over matching documents to count them and therefore have similar performance.

From the MongoDB docs: db.collection.count()

NOTE:

MongoDB drivers compatible with the 4.0 features deprecate their respective cursor and collection count() APIs in favor of new APIs for countDocuments() and estimatedDocumentCount(). For the specific API names for a given driver, see the driver documentation.

From the MongoDB docs: db.collection.countDocuments()

db.collection.countDocuments(query, options)

New in version 4.0.3.

Returns the count of documents that match the query for a collection or view. The method wraps the $group aggregation stage with a $sum expression to perform the count and is available for use in Transactions.
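As a rough illustration of what that wrapping means in the C# driver, the sketch below runs the same count twice: once through CountDocumentsAsync and once through a hand-built pipeline that filters with $match and then applies the $group stage seen in the profiler output from the question. The connection string and the database name `somedb` are assumptions; the collection name and the filter come from the question's log. It needs the MongoDB.Driver package, a reachable server, and C# 7.1 or later for the async Main.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class CountComparison
{
    public static async Task Main()
    {
        // Assumed connection string and database name; adjust for your deployment.
        var client = new MongoClient("mongodb://localhost:27017");
        var collection = client.GetDatabase("somedb")
                               .GetCollection<BsonDocument>("reservations");

        var filter = Builders<BsonDocument>.Filter.Eq("some_query_key", "query_value");

        // Driver helper: shows up in the profiler as an "aggregate" command.
        long viaHelper = await collection.CountDocumentsAsync(filter);

        // Hand-written pipeline using the same $group stage as in the log.
        var group = new BsonDocument
        {
            { "_id", BsonNull.Value },
            { "n", new BsonDocument("$sum", 1) }
        };
        BsonDocument result = await collection.Aggregate()
                                              .Match(filter)
                                              .Group(group)
                                              .FirstOrDefaultAsync();

        // $group emits no document when nothing matches, so treat null as zero.
        long viaPipeline = result == null ? 0 : result["n"].ToInt64();

        Console.WriteLine($"CountDocumentsAsync: {viaHelper}, pipeline: {viaPipeline}");
    }
}
```

On a collection that is not being written to, both numbers should agree, and both calls should appear in the profiler as aggregate commands.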

A more detailed explanation of this API change can be found on the MongoDB JIRA site:

Drivers supporting MongoDB 4.0 must deprecate the count() helper and add two new helpers - estimatedDocumentCount() and countDocuments(). Both helpers are supported with MongoDB 2.6+.

The names of the new helpers were chosen to make it clear how they behave and exactly what they do. The estimatedDocumentCount helper returns an estimate of the count of documents in the collection using collection metadata, rather than counting the documents or consulting an index. The countDocuments helper counts the documents that match the provided query filter using an aggregation pipeline.

The count() helper is deprecated. It has always been implemented using the count command. The behavior of the count command differs depending on the options passed to it and the topology in use and may or may not provide an accurate count. When no query filter is provided the count command provides an estimate using collection metadata. Even when provided with a query filter the count command can return inaccurate results with a sharded cluster if orphaned documents exist or if a chunk migration is in progress. The countDocuments helper avoids these sharded cluster problems entirely when used with MongoDB 3.6+, and when using Primary read preference with older sharded clusters.
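In C# driver terms, the two replacement helpers map onto EstimatedDocumentCountAsync and CountDocumentsAsync. The sketch below shows one way to keep the choice explicit in calling code; the class and method names (`CountHelpers`, `EstimateAllAsync`, `CountMatchingAsync`) are hypothetical wrappers introduced for illustration, not part of the driver.

```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class CountHelpers
{
    // Metadata-based estimate of the whole collection; takes no filter.
    public static Task<long> EstimateAllAsync(IMongoCollection<BsonDocument> collection)
        => collection.EstimatedDocumentCountAsync();

    // Count of the documents matching a filter, run as an aggregation.
    public static Task<long> CountMatchingAsync(
        IMongoCollection<BsonDocument> collection,
        FilterDefinition<BsonDocument> filter)
        => collection.CountDocumentsAsync(filter);
}
```

Passing `Builders<BsonDocument>.Filter.Empty` to the second method counts every document without relying on metadata, at the cost of the server scanning the collection or an index.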

Chris answered Oct 11 '22 12:10