Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Incorrect Count returned by MongoDB (WiredTiger)

This sounds odd, and I hope I am doing something wrong, but my MongoDB collection is returning the Count off by one in my collection.

I have a collection with (I am sure) 359671 documents. However the count() command returns 359670 documents.

I am executing the count() command using the mongo shell:

rs0:PRIMARY> db.COLLECTION.count()
359670

This is incorrect.

It is not finding each and every document in my collection.

If I provide the following query to count, I get the correct result:

rs0:PRIMARY> db.COLLECTION.count({_id: {$exists: true}})
359671

I believe this is a bug in WiredTiger. As far as I am aware each document has the same definition, an _id field of an integer ranging from 0 to 359670, and a BinData field. I did not have this problem with the older storage engine (or Mongo 2, either could have caused the issue).

Is this something I have done wrong? I do not want to use the {_id: {$exists: true}} query as that takes 100x longer to complete.

like image 369
James Avatar asked Jun 08 '15 17:06

James


People also ask

How do I count records in MongoDB?

count() method is used to return the count of documents that would match a find() query. The db. collection. count() method does not perform the find() operation but instead counts and returns the number of results that match a query.

What is count () in MongoDB?

The count() method counts the number of documents that match the selection criteria. It returns the number of documents that match the selection criteria. It takes two arguments first one is the selection criteria and the other is optional.

How do you use count in Pymongo?

Method 2: count_documents() Alternatively, you can also use count_documents() function in pymongo to count the number of documents present in the collection. Example: Retrieves the documents present in the collection and the count of the documents using count_documents().

How do I count documents in MongoDB Atlas?

The Atlas Search count option adds a field to the metadata results document that displays a count of the search results for the query. You can use count to determine the size of the result set. You can use it in the $search or $searchMeta stage.


1 Answers

As now stated in the doc, db.collection.count() without using a query parameter, returns results based on the collection’s metadata:

This may result in an approximate count. In particular:

  • On a sharded cluster, the resulting count will not correctly filter out orphaned documents.

  • After an unclean shutdown, the count may be incorrect.

When using a query parameter, as you did in the second query ({_id: {$exists: true}}), then it forces count to not use the collection's metadata, but to scan the collection instead.


Starting Mongo 4.0.3, count() is considered deprecated and the following alternatives are recommended instead:

  • Exact count of douments:
db.collection.countDocuments({})

which under the hood actually performs the following "expensive", but accurate aggregation (expensive since the whole collection is scanned to count records):

db.collection.aggregate([{ $group: { _id: null, n: { $sum: 1 } } }])
  • Approximate count of documents:
db.collection.estimatedDocumentCount()

which performs exactly what db.collection.count() does/did (it's actually a wrapper around count), which uses the collection’s metadata.

This is thus almost instantaneous, but may lead to an approximate result in the particular cases mentioned above.

like image 159
Xavier Guihot Avatar answered Oct 31 '22 12:10

Xavier Guihot