Incorrect Count returned by MongoDB (WiredTiger)

Tags:

mongodb

wiredtiger

This sounds odd, and I hope I am doing something wrong, but count() on my MongoDB collection is returning a value that is off by one.

I have a collection with (I am sure) 359671 documents. However, the count() command returns 359670.

I am executing the count() command using the mongo shell:

rs0:PRIMARY> db.COLLECTION.count()
359670

This is incorrect.

It is not finding each and every document in my collection.

If I provide the following query to count, I get the correct result:

rs0:PRIMARY> db.COLLECTION.count({_id: {$exists: true}})
359671

I believe this is a bug in WiredTiger. As far as I am aware, every document has the same shape: an integer _id field ranging from 0 to 359670, and a BinData field. I did not have this problem with the older storage engine (or with Mongo 2; either change could have caused the issue).

Is this something I have done wrong? I do not want to use the {_id: {$exists: true}} query as that takes 100x longer to complete.

369

asked Jun 08 '15 17:06

James

1 Answer

As the documentation now states, db.collection.count() without a query parameter returns results based on the collection’s metadata:

This may result in an approximate count. In particular:

On a sharded cluster, the resulting count will not correctly filter out orphaned documents.

After an unclean shutdown, the count may be incorrect.

When you pass a query parameter, as you did in the second query ({_id: {$exists: true}}), count can no longer rely on the collection's metadata and has to scan the collection instead.
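To make the difference between the two code paths concrete, here is a toy model in plain JavaScript. It is only a sketch and not how MongoDB or WiredTiger is actually implemented: the ToyCollection class, its cachedCount field, and the crashBeforeMetadataUpdate option are all made up for illustration. It shows how a cached counter can fall one behind the stored documents while a scan still finds all of them.

```javascript
// Toy model only: NOT MongoDB's or WiredTiger's real implementation.
class ToyCollection {
  constructor() {
    this.docs = new Map();   // the stored documents, keyed by _id
    this.cachedCount = 0;    // stand-in for the count kept in collection metadata
  }

  insert(doc, { crashBeforeMetadataUpdate = false } = {}) {
    this.docs.set(doc._id, doc);
    // Simulated unclean shutdown: the document is stored, but the
    // cached counter is never updated.
    if (!crashBeforeMetadataUpdate) {
      this.cachedCount += 1;
    }
  }

  // Analogous to count() with no query: read the cached number, O(1).
  metadataCount() {
    return this.cachedCount;
  }

  // Analogous to count(query): look at the documents and count matches, O(n).
  scanCount(predicate = () => true) {
    let n = 0;
    for (const doc of this.docs.values()) {
      if (predicate(doc)) n += 1;
    }
    return n;
  }
}

const coll = new ToyCollection();
for (let i = 0; i < 9; i++) coll.insert({ _id: i });
coll.insert({ _id: 9 }, { crashBeforeMetadataUpdate: true });

console.log(coll.metadataCount());                        // 9
console.log(coll.scanCount((d) => d._id !== undefined));  // 10
```

The cached counter is cheap to read, which is why the query-less count is fast, but it is only as accurate as the last time it was updated.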

Starting with MongoDB 4.0.3, count() is considered deprecated, and the following alternatives are recommended instead:

Exact count of documents:

db.collection.countDocuments({})

which, under the hood, performs the following aggregation. It is accurate but "expensive", since the whole collection is scanned to count the documents:

db.collection.aggregate([{ $group: { _id: null, n: { $sum: 1 } } }])

Approximate count of documents:

db.collection.estimatedDocumentCount()

which does exactly what db.collection.count() does (or did) when called without a query; it is actually a wrapper around count and therefore uses the collection’s metadata.

This is thus almost instantaneous, but may lead to an approximate result in the particular cases mentioned above.
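If you want to check how far the estimate has drifted, one option is a small helper that calls both methods and reports the difference. This is a minimal sketch: countReport and stubCollection are hypothetical names, and the helper assumes a shell-style collection whose methods return numbers directly. The stub only exists so the snippet runs outside the shell, using the two numbers from the question.

```javascript
// Hypothetical helper: compare the metadata-based estimate with an exact count.
// Assumes shell-style methods that return plain numbers.
function countReport(coll) {
  const estimated = coll.estimatedDocumentCount(); // fast, read from metadata
  const exact = coll.countDocuments({});           // slow, examines the documents
  return { estimated, exact, drift: exact - estimated };
}

// Stand-in for a real collection, with the counts reported in the question.
const stubCollection = {
  estimatedDocumentCount: () => 359670,
  countDocuments: () => 359671,
};

console.log(countReport(stubCollection));
```

In the shell you would call countReport(db.COLLECTION) instead of passing the stub; for the stub, the reported drift is 1. With the Node.js driver the same two methods return promises, so the helper would need to be async and await both calls. If the drift comes from an unclean shutdown, the MongoDB documentation suggests running validate on the collection (db.COLLECTION.validate() in the shell) to restore accurate count statistics.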

159

answered Oct 31 '22 12:10

Xavier Guihot
