Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB Query for records with non-existant field & indexing

We have a mongo database with around 1M documents, and we want to poll this database using a processed field to find documents which we havent seen before. To do this we are setting a new field called _processed.

To query for documents which need to be processed, we query for documents which do not have this processed field:

db.stocktwits.find({ "_processed" : { "$exists" : false } })

However, this query takes around 30 seconds to complete each time, which is rather slow. There is an index (asc) which sits on the _processed field:

db.stocktwits.ensureIndex({ "_processed" : -1 },{ "name" : "idx_processed" });

Adding this index does not change query performance. There are a few other indexes sitting on the collection (namely the ID idx & a unique index of a couple of fields in each document).

The _processed field is a long, perhaps this should be changed to a bool to make things quicker?

We have tried using a $where query (i.e. $where : this._processed==null) to do the same thing as $exists : false and the performance is about the same (few secs slower which makes sense)...

Any ideas on what would be casusing the slow performance (or is it normal)? Does anyone have any suggestions on how to improve the query speed?

Cheers!

like image 587
NightWolf Avatar asked Jan 19 '23 16:01

NightWolf


1 Answers

Upgrading to 2.0 is going to do this for you:

From MongoDB.org:

Before v2.0, $exists is not able to use an index. Indexes on other fields are still used.

like image 122
joostdevries Avatar answered Jan 31 '23 06:01

joostdevries