I have a compound _id containing 3 numeric properties:
_id": { "KeyA": 0, "KeyB": 0, "KeyC": 0 }
The database in question has 2 million identical values for KeyA and clusters of 500k identical values for KeyB.
My understanding is that I can efficiently query for KeyA and KeyB using the command:
find( { "_id.KeyA" : 1, "_id.KeyB": 3 } ).limit(100)
When I explain this query the result is:
"cursor" : "BasicCursor",
"nscanned" : 1000100,
"nscannedObjects" : 1000100,
"n" : 100,
"millis" : 1592,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {}
Without the limit() the result is:
"cursor" : "BasicCursor",
"nscanned" : 2000000,
"nscannedObjects" : 2000000,
"n" : 500000,
"millis" : 3181,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {}
As I understand it BasicCursor means that index has been ignored and both queries have a high execution time - even when I've only requested 100 records it takes ~1.5 seconds. It was my intention to use the limit to implement pagination but this is obviously too slow.
The command:
find( { "_id.KeyA" : 1, "_id.KeyB": 3, , "_id.KeyC": 1000 } )
Correctly uses the BtreeCursor and executes quickly suggesting the compound _id is correct.
I'm using the release 1.8.3 of MongoDb. Could someone clarify if I'm seeing the expected behaviour or have I misunderstood how to use/query the compound index?
Thanks, Paul.
The index is not a compound index, but an index on the whole value of the _id
field. MongoDB does not look into an indexed field, and instead uses the raw BSON representation of a field to make comparisons (if I read the docs correctly).
To do what you want you need an actual compound index over {_id.KeyA: 1, _id.KeyB: 1, _id.KeyC: 1}
(which also should be a unique index). Since you can't not have an index on _id
you will probably be better off leaving it as ObjectId
(that will create a smaller index and waste less space) and keep your KeyA
, KeyB
and KeyC
fields as properties of your document. E.g. {_id: ObjectId("xyz..."), KeyA: 1, KeyB: 2, KeyB: 3}
You would need a separate compound index for the behavior you desire. In general I recommend against using objects as _id because key order is significant in comparisons, so {a:1, b:1} does not equal {b:1, a:1}. Since not all drivers preserve key order in objects it is very easy to shoot yourself in the foot by doing something like this:
db.foo.save(db.foo.findOne())
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With