Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB not using compound index on '_id'

I have a collection in MongoDB which has following documents.

/* 0 */
{
    "T" : [ 
        374135056604448742
    ],
    "_id" : {
        "#" : 7778532275691,
        "ts" : ISODate("2013-07-26T02:25:00Z")
    }
}

/* 1 */
{
    "T" : [ 
        1056188940167152853
    ],
    "_id" : {
        "#" : 34103385525388,
        "ts" : ISODate("2013-07-30T03:00:00Z")
    }
}

/* 2 */
{
    "T" : [ 
        1056188940167152853
    ],
    "_id" : {
        "#" : 34103385525388,
        "ts" : ISODate("2013-07-30T03:18:00Z")
    }
}

Now, I'm trying to query some documents with following query.

db.entries.find({
    '_id.ts': {'$gte': beginTS, '$lte': endTS}, 
    '_id.#' : 884327843395156951
    }).hint([('_id', 1)]).explain()

According to my understanding, since _id is a compound field, and Mongo always maintains a index on _id, hence to answer above query, Mongo should have used the index on '_id'. However, the answer to the above query is as following:

{u'allPlans': [{u'cursor': u'BtreeCursor _id_',
   u'indexBounds': {u'_id': [[{u'$minElement': 1}, {u'$maxElement': 1}]]},
   u'n': 2803,
   u'nscanned': 4869528,
   u'nscannedObjects': 4869528}],
 u'cursor': u'BtreeCursor _id_',
 u'indexBounds': {u'_id': [[{u'$minElement': 1}, {u'$maxElement': 1}]]},
 u'indexOnly': False,
 u'isMultiKey': False,
 u'millis': 128415,
 u'n': 2803,
 u'nChunkSkips': 0,
 u'nYields': 132,
 u'nscanned': 4869528,
 u'nscannedAllPlans': 4869528,
 u'nscannedObjects': 4869528,
 u'nscannedObjectsAllPlans': 4869528,
 u'scanAndOrder': False,

As it can be observed, MongoDB is doing an entire scan of DB to find just handful of documents. I don't know what the hell is wrong here.

I tried changing the order of query, but same result. I have no idea what is happening here. Any help if deeply appreciated.

UPDATE

I understood the nuance here. The _id is not a compound index, it's a mere exact index. This means that if _id is a document then irrespective of the structure of document and how many nested attrs or sub-documents it may have, the _id index will only contain one entry for the _id field. This entry is suppose to be hash of _id document and will be maintained unique.

like image 602
VaidAbhishek Avatar asked Aug 22 '13 18:08

VaidAbhishek


1 Answers

You are using an object as a key, but you're not using a compund index here.

The _id index is a bit special, because it is created automatically and is always unique. Normally, the _id index is an ObjectId, a UUID or maybe an integer or a string that contains some kind of hash. MongoDB supports complex objects as keys. However, to MongoDB, this is still just a document. It can be compared to other documents, and documents that have the same fields and values will be equal. But since you didn't create the index keys (and you can't create that index manually), MongoDB has no idea that it contains a field # and a field ts.

A compound index, on the other hand, refers to the fields of a document explicitly, e.g. {"product.quantity" : 1, "product.created" : -1}. This must be specified when the index is created.

It seems you're trying to basically store a timestamp in your primary key. MongoDB's ObjectId already contains a timestamp, so you can do date-based range queries on ObjectIds directly.

like image 173
mnemosyn Avatar answered Oct 15 '22 14:10

mnemosyn