I have a large MyISAM table. It's approaching 1 million rows. It's basically a list of items and some information about them.
There are two indices:
I run two queries:
SELECT * FROM table WHERE date = '2011-02-01' AND col < 5 LIMIT 10
SELECT * FROM table WHERE date < '2011-02-01' AND col < 5 LIMIT 10
The first one finishes in ~0.0005 seconds and the second in ~0.05 seconds. That is 100X difference. Is it wrong for me to expect both of these to run at roughly the same speed? I must not be understanding the indices very well. How can I speed up the second query?
Regardless of Mysql it boils down to basic algorithm theory.
Greater than and Less than operations on a large set are slower than Identity operations. With a large data set an ideal data structure for determining less than or greater is a self balancing tree (binary or n-tree). On a a self balanced tree the worst case scenario to find all less/greater is log n.
The ideal data structure for identity lookup is a hashtable. The performance of hashtables is generally O(1) aka fixed time. A hashtable however is not good for greater/less.
Generally a well balanced tree is only slightly less performing than a hashtable (which is how Haskell gets away with using a tree for hashtables).
Thus irregardless of what Mysql does its not surprise that <,> is slower than =
Old Answer below:
Because the first one is like Hashtable lookup since its '=' (particularly if your index is a hashtable) it will be faster than the second one which might work better with a tree like index.
Since MySql allows to configure the index format you can try changing that but I'm rather sure the first will always run faster than the second.
I'm assuming you have an index on the date column. The first query uses the index, the second query probably does a linear scan (at least over part of the data). A direct fetch is always faster than a linear scan.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With