Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mongodb index strategy for range query with different fields

Almoust all my documents include 2 fields, start timestamp and finish timestamp. And in each my query I need to get elements which is in selected period of time. so start should be after selected value and final should be before selected timestamp.

query looks like

db.collection.find({start:{$gt:DateTime(...)}, final:{$lt:DateTime(...)}})

So what the best indexing strategy for that scenario?


By the way, which is better for performance - to store date as datetimes or as unix timestamps, which is long value itself

like image 584
silent_coder Avatar asked Mar 24 '13 17:03

silent_coder


People also ask

Which is the index used for multiple fields in MongoDB?

MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes. The order of fields listed in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: -1 } , the index sorts first by userid and then, within each userid value, sorts by score .

Can a query use multiple indexes MongoDB?

MongoDB can use the intersection of multiple indexes to fulfill queries. In general, each index intersection involves two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query.

Does MongoDB index all fields?

MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections have an index on the _id field, and applications and users may add additional indexes to support important queries and operations. This document describes ascending/descending indexes on a single field.

How does MongoDB decide which index to use?

MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays.


1 Answers

Too add a little more to baloo's answer.

On the time-stamp vs. long issue. Generally the MongoDB server will not see a difference. The BSON encoding length is the same (64 bits). You may see a performance different on the client side depending on the driver's encoding. As an example, on the Java side a using the 10gen driver a time-stamp is rendered as Date that is a lot heavier than Long. There are drivers that try to avoid that overhead.

The other issue is that you will see a performance improvement if you close the range for the first field of the index. So if you use the index suggested by baloo:

db.collection.ensureIndex({start: 1, final: 1})

You query will perform (potentially much) better if you query is:

db.collection.find({start:{$gt:DateTime(...),$lt:DateTime(...)}, 
                    final:{$lt:DateTime(...)}})

Conceptually, if you think of the indexes as a a tree the closed range limits both sides of the tree instead of just one side. Without the closed range the server has to "check" all of the entries with a start greater than the time stamp provided since it does not know of the relation between start and final.

You may even find that that the query performance is no better using a single field index like:

db.collection.ensureIndex({start: 1})

Most of the savings is from the first field's pruning. The case where this will not be the case is when the query is covered by the index or the ordering/sort for the results can be derived from the index.

HTH - Rob.

like image 181
Rob Moore Avatar answered Sep 23 '22 16:09

Rob Moore