
MongoDB db.collection.count() vs db.collection.find().length()

Tags:

mongodb

I would like to understand why these two commands, when run from a mongos instance against the same MongoDB collection, return different numbers:

  • db.users.count()
  • db.users.find().length()

What can be the reason and can it be a sign of underlying issues?

Marius Cotofana asked Oct 31 '22 07:10


1 Answer

I believe your collection is sharded.

Most sharded database solutions show this kind of discrepancy, because some commands answer from what each shard reports about itself, including documents that the shard holds but does not currently own, while other commands actually read the documents and keep only those each shard owns.

This is something to always keep in mind. It mostly applies to commands that:

  • count documents
  • return the document with the lowest value for a given field
  • return the document with the highest value for a given field
  • ...

From the MongoDB docs:

count() is equivalent to the db.collection.find(query).count() construct. ...

Sharded Clusters

On a sharded cluster, db.collection.count() can result in an inaccurate count if orphaned documents exist or if a chunk migration is in progress. ...
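To check whether this is what you are seeing, you can compare the fast count with counts that actually read the documents. Below is a minimal sketch for the shell, assuming the collection is db.users as in the question. compareCounts is a hypothetical helper name, not a built-in, and the comments describe the documented behaviour of count() without a query predicate.

```javascript
// Hypothetical helper: compare three ways of counting one collection.
// `coll` is a shell collection object such as db.users.
function compareCounts(coll) {
  // Without a query predicate, count() is answered from collection metadata,
  // so on a sharded cluster it can include orphaned or in-flight documents.
  const fast = coll.count();

  // Iterating the cursor reads the documents themselves; this is what
  // find().length() does under the hood in the legacy shell.
  const iterated = coll.find().toArray().length;

  // countDocuments() runs an aggregation over the documents instead of
  // relying on metadata.
  const aggregated = coll.countDocuments({});

  // A positive surplus hints at orphans or a migration in progress.
  return { fast, iterated, aggregated, surplus: fast - iterated };
}
```

In the shell this would be called as compareCounts(db.users). Note that toArray() pulls every document into memory, so on a large collection countDocuments({}) alone is probably the better comparison.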

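So in the case of MongoDB, it is simply because the balancer runs in the background and migrates chunks of documents between shards, in order to keep the distribution across shards compliant with the sharding policy defined on the collection. While a chunk is being migrated, or when orphaned copies are left behind afterwards, the same document can exist on two shards at once, and count() includes both copies.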
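The effect can be illustrated with a toy model in plain JavaScript. This is not MongoDB code: the shard names, the ownership rule and both functions are made up for the illustration, and it only assumes that a migrated chunk briefly exists on both the donor and the recipient shard.

```javascript
// Toy model of two shards right after a chunk (_id >= 3) moved from A to B.
// shardA still holds its old copies of _id 3 and 4: these are orphans.
const shards = {
  shardA: [{ _id: 1 }, { _id: 2 }, { _id: 3 }, { _id: 4 }],
  shardB: [{ _id: 3 }, { _id: 4 }, { _id: 5 }],
};

// Hypothetical ownership rule standing in for the cluster's chunk map.
function ownerOf(doc) {
  return doc._id >= 3 ? "shardB" : "shardA";
}

// Metadata-style count: add up what each shard says it stores.
function metadataCount(shards) {
  return Object.values(shards).reduce((sum, docs) => sum + docs.length, 0);
}

// Query-style count: read the documents and keep only those that the
// shard holding them actually owns.
function filteredCount(shards) {
  return Object.entries(shards).reduce(
    (sum, [name, docs]) => sum + docs.filter((d) => ownerOf(d) === name).length,
    0
  );
}

console.log(metadataCount(shards)); // 7 (orphans counted twice)
console.log(filteredCount(shards)); // 5 (each document counted once)
```

Once the orphaned copies on the donor shard are cleaned up, both counts agree again, which is why the discrepancy is usually temporary.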

Keep in mind that, to offer the best performance, documents are not pinned to one place: mongos routes each write to the shard that currently owns the document's chunk, and the balancer may later move that chunk to the shard where it is really meant to be.

This is one reason NoSQL databases are often described as eventually consistent.

DevLounge answered Nov 15 '22 06:11