I have a question about this query:
SELECT *
FROM runs
WHERE (NOW() BETWEEN began_at AND finished_at)
Do you think it makes sense to create a composite index on the began_at and finished_at columns? Or does it make sense to create an index only on began_at?
Your style is very uncommon.
Most people would probably write WHERE began_at <= NOW() AND finished_at >= NOW() (BETWEEN is inclusive, so the comparisons are <= and >=).
However, I would recommend putting a separate index on each of the two fields.
A combined key won't be of much use to you because it would only speed up searches for specific date combinations.
Well, this is not entirely true: if you use a B-tree, a combined key will help, but not as much as indexing the fields separately. Combined keys are very good if you search on combinations of fields with the equality (=) operator. Single-field indexes perform better for range queries.
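As a rough sketch of the two options being compared, assuming the MySQL table runs from the question (the index names idx_runs_began_at, idx_runs_finished_at and idx_runs_began_finished are made up for illustration), you could create both kinds of index and compare the plans the optimizer reports:

```sql
-- Option A: one single-column index per field.
CREATE INDEX idx_runs_began_at ON runs (began_at);
CREATE INDEX idx_runs_finished_at ON runs (finished_at);

-- Option B: one composite index. For a pure range query, only the
-- leading column (began_at) can bound the B-tree range scan.
CREATE INDEX idx_runs_began_finished ON runs (began_at, finished_at);

-- Compare the estimated "rows" for each candidate index.
EXPLAIN
SELECT *
FROM runs FORCE INDEX (idx_runs_began_at)
WHERE began_at <= NOW()
  AND finished_at >= NOW();

EXPLAIN
SELECT *
FROM runs FORCE INDEX (idx_runs_began_finished)
WHERE began_at <= NOW()
  AND finished_at >= NOW();
```

Which option wins depends on the data distribution, so it is worth comparing the plans on a realistic copy of the table before dropping either index.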
You can google a bit for "multidimensional range search".
The reason is that the matching values in a single indexed field can basically be located in O(log(n)) time in a B-tree. So with k indexed fields your overall runtime will be O(k*log(n)), which is still O(log(n)) for a fixed k.
Multidimensional range queries have a runtime of O(sqrt(n)), which is higher. There are better implementations that also achieve logarithmic runtime, but they are not fully implemented in MySQL, so it will be worse or awful depending on the version.
So let me sum up:
Equality comparisions on single fields: hash index (runtime O(1))
Range search on single fields: btree index on single fields (runtime O(log(n)))
Equality search on multiple fields: combined hash key (runtime O(1))
Those cases are clear-cut.
Range search on multiple fields: this is where it's not so clear. With current versions it's clearly better to index separately, for the reasons given above. With a perfect implementation for that use case you could achieve better performance with combined keys, but there is no system I know of that supports it. MySQL has supported loose index scans (which you need for that) since version 5.0, but only in a very limited way, and the query optimizer only uses them in rare cases, as far as I know. I don't know about newer versions like 5.3 or something.
However, with MySQL implementing loose index scans, combined keys on fields where you do range queries or sort in different directions are becoming more and more relevant.
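The clear-cut cases from the summary above could look roughly like this in MySQL. This is only a sketch: the table lookup_demo and its columns are hypothetical, and it assumes the MEMORY storage engine, since not every MySQL engine builds a true hash index when asked for one.

```sql
-- Hypothetical table; MEMORY is one engine that supports both
-- HASH and BTREE indexes.
CREATE TABLE lookup_demo (
  id INT NOT NULL,
  category INT NOT NULL,
  created_at DATETIME NOT NULL,
  -- Equality comparison on a single field: hash index.
  INDEX idx_id (id) USING HASH,
  -- Equality search on multiple fields: combined hash key.
  INDEX idx_id_category (id, category) USING HASH,
  -- Range search on a single field: B-tree index.
  INDEX idx_created_at (created_at) USING BTREE
) ENGINE = MEMORY;

-- Queries matching each index, in the same order.
SELECT * FROM lookup_demo WHERE id = 42;
SELECT * FROM lookup_demo WHERE id = 42 AND category = 7;
SELECT * FROM lookup_demo
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';
```

A hash index cannot serve the last query, because it only supports equality lookups, which is why the range case gets a B-tree.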
Because the query uses inequalities rather than equalities, a composite index isn't going to do much better (and may do worse) than two individual indexes.
I'd lean towards two individual indexes, one on began_at and one on finished_at.
References for Loose index scan:
http://www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/
http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html
The "Index Merge" strategy could come into play from MySQL 5 onwards: http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html - which also suggests that separate indexes might be better.
However, I have never been able to get it to work for me :)
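One way to check whether Index Merge is actually being used, assuming separate indexes on began_at and finished_at already exist on runs, is to read the EXPLAIN output. This is just a sketch, and the optimizer is free to pick a different plan:

```sql
EXPLAIN
SELECT *
FROM runs
WHERE began_at <= NOW()
  AND finished_at >= NOW();
-- If Index Merge is chosen, the "type" column reads index_merge and
-- the "key" column lists more than one index. Otherwise a single
-- index (or a full table scan) was preferred.
```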