
mongodb - index on date is not used

Tags: mongodb

The events collection has a userId and an array of events; each element in the array is an embedded document. Example:

{
    "_id" : ObjectId("4f8f48cf5f0d23945a4068ca"),
    "events" : [
        {
            "eventType" : "profile-updated",
            "eventId" : "247266",
            "eventDate" : ISODate("1938-04-27T23:05:51.451Z")
        },
        {
            "eventType" : "login",
            "eventId" : "64531",
            "eventDate" : ISODate("1948-05-15T23:11:37.413Z")
        }
    ],
    "userId" : "junit-19568842"
}

I am using a query like the one below to find events generated in the last 30 days:

db.events.find({
    events : {
        $elemMatch : {
            "eventId" : 201,
            "eventDate" : { $gt : new Date(1231657163876) }
        }
    }
}).explain()
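The timestamp 1231657163876 in the query above is a fixed point in time (2009-01-11, per the index bounds reported by explain() further down), not a rolling 30-day window. A minimal sketch of computing the cutoff relative to the current time, in plain JavaScript. The helper name recentEventsFilter is hypothetical, and eventId is passed as a string on the assumption that it is stored as shown in the sample document:

```javascript
// Hypothetical helper (not from the original post): builds the $elemMatch
// filter with a cutoff computed relative to "now" instead of a fixed timestamp.
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

function recentEventsFilter(eventId, now = new Date()) {
  const cutoff = new Date(now.getTime() - THIRTY_DAYS_MS);
  return {
    events: {
      $elemMatch: { eventId: eventId, eventDate: { $gt: cutoff } },
    },
  };
}

// Fixed "now" so the result is reproducible; omit the second argument in real use.
const filter = recentEventsFilter("64531", new Date("2012-04-18T23:04:00Z"));
console.log(filter.events.$elemMatch.eventDate.$gt.toISOString());
// prints "2012-03-19T23:04:00.000Z"
```

The resulting object could then be passed to db.events.find() in place of the literal filter.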

The query plan shows that the index on "events.eventDate" is used when the test data contains few events (around 20):

{
    "cursor" : "BtreeCursor events.eventDate_1",
    "nscanned" : 0,
    "nscannedObjects" : 0,
    "n" : 0,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : true,
    "indexOnly" : false,
    "indexBounds" : {
            "events.eventDate" : [
                    [
                            ISODate("2009-01-11T06:59:23.876Z"),
                            ISODate("292278995-01--2147483647T07:12:56.808Z")
                    ]
            ]
    }
}

However, when there is a large number of events (around 500), the index is not used:

{
    "cursor" : "BasicCursor",
    "nscanned" : 4,
    "nscannedObjects" : 4,
    "n" : 0,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : { }
}

Why is the index not being used when there are a lot of events? Maybe when there is a large number of events, MongoDB decides it is more efficient to scan all the items than to use the index?

dsatish asked Apr 18 '12 23:04



1 Answer

MongoDB's query optimizer works in a special way. Rather than calculating the cost of each candidate query plan, it simply runs all available plans. Whichever plan returns first is considered the optimal one and will be reused for that query in the future.

As the application grows and the data changes, the plan that was optimal may stop being optimal at some point. So MongoDB repeats this plan selection process every once in a while.

It appears that in this particular case, a basic scan was the most efficient plan.
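To make the "first plan to finish wins" idea concrete, here is a toy simulation in plain JavaScript. It is only a sketch of the behaviour described above, not MongoDB's implementation: the functions basicScan, indexScan, racePlans, makeDocs and choosePlan are hypothetical, and the test data (4 documents, all events sharing one date) is made up so that the outcome is easy to follow.

```javascript
// Toy model only; MongoDB's real optimizer is not implemented like this.
// A "plan" is a generator that yields once per unit of work (one document or
// one index entry examined) and returns its matching documents when finished.

function* basicScan(docs, matches) {
  const out = [];
  for (const doc of docs) {
    yield; // one document examined
    if (doc.events.some(matches)) out.push(doc);
  }
  return out;
}

function* indexScan(index, docs, cutoff, matches) {
  const out = new Map();
  for (const entry of index) {
    if (entry.date <= cutoff) continue; // outside the index bounds: no work
    yield; // one index entry examined
    const doc = docs[entry.docPos];
    if (!out.has(entry.docPos) && doc.events.some(matches)) {
      out.set(entry.docPos, doc);
    }
  }
  return [...out.values()];
}

// Advance every plan one step per round; the first plan to finish wins.
function racePlans(plans) {
  for (;;) {
    for (const [name, plan] of plans) {
      const step = plan.next();
      if (step.done) return { winner: name, results: step.value };
    }
  }
}

// Made-up test data: numDocs users, each with eventsPerDoc events on one date.
function makeDocs(numDocs, eventsPerDoc, eventDate) {
  return Array.from({ length: numDocs }, (_, d) => ({
    userId: `user-${d}`,
    events: Array.from({ length: eventsPerDoc }, (_, e) => ({
      eventId: String(e),
      eventDate,
    })),
  }));
}

function choosePlan(docs, cutoff) {
  // Multikey index: one entry per array element, pointing back at its document.
  const index = docs.flatMap((doc, docPos) =>
    doc.events.map((e) => ({ date: e.eventDate, docPos }))
  );
  // Like the query in the question, this predicate matches nothing (n = 0).
  const matches = (e) => e.eventId === "no-such-id" && e.eventDate > cutoff;
  return racePlans([
    ["BtreeCursor events.eventDate_1", indexScan(index, docs, cutoff, matches)],
    ["BasicCursor", basicScan(docs, matches)],
  ]).winner;
}

const cutoff = new Date("2009-01-11T06:59:23.876Z");
// 20 events per document, all dated before the cutoff: the index range is empty.
const oldDocs = makeDocs(4, 20, new Date("1948-05-15T23:11:37.413Z"));
// 500 events per document, all dated after the cutoff: 2000 index entries in range.
const newDocs = makeDocs(4, 500, new Date("2012-04-01T00:00:00Z"));

console.log(choosePlan(oldDocs, cutoff)); // prints "BtreeCursor events.eventDate_1"
console.log(choosePlan(newDocs, cutoff)); // prints "BasicCursor"
```

In this toy setup the index plan wins when no index entries fall inside the date range, while the basic scan wins when it only has to look at 4 documents and the index plan has to walk 2000 entries. Whether this is what happened with the data in the question is an assumption.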

Link: http://www.mongodb.org/display/DOCS/Query+Optimizer

Sergio Tulentsev answered Sep 27 '22 21:09