I have a table with 2 million rows. I have two index (status, gender) and also (birthday).
I find strange that this query is taking 3.6 seconds or more QUERY N° 1
SELECT COUNT(*) FROM ts_user_core
WHERE birthday BETWEEN '1980-01-01' AND '1985-01-01'
AND status='ok' AND gender='female';
same for this: QUERY N° 2
SELECT COUNT(*) FROM ts_user_core
WHERE status='ok' AND gender='female'
AND birthday between '1980-01-01' AND '1985-01-01';
While this query is taking 0.140 seconds QUERY N° 3
select count(*) from ts_user_core where (birthday between '1990-01-01' and '2000-01-01');
Also this query takes 0.2 seconds QUERY N° 4
select count(*) from ts_user_core where status='ok' and gender='female'
I expect the first query to be way more faster, how can be possible this behavior? I can't handle so much time for this query.
Here the result of:
I know that I can add a new index with 3 columns, but is there a way to have a faster query without adding an index for every where clause?
Thanks for your advice
Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.
Having two identical indexes makes a negative impact on the performance of SQL queries. It is actually a waste of disk space and also slows down the insertions to the table. Therefore, it is a good practice to avoid duplicate indexes to eliminate these issues. Duplication of indexes can happen in multiple ways.
GROUP BY is a powerful statement, but it tends to slow down queries.
is there a way to optimize the query without adding an index for every possible where clause?
Yes, somewhat. But it takes an understanding of how INDEXes work.
Let's look at all the SELECTs
you have presented so far.
SELECT
, start with all the = constant
items in the WHERE
clause. Put those columns into an index in any order. That gives us INDEX(status, gender, ...)
or INDEX(gender, status, ...)
, but nothing deciding between them (yet).ORDER BY
. In your first couple of SELECTs
, that would be birthday
. Now we have INDEX(status, gender, birthday)
or INDEX(gender, status, birthday)
. Either of these is 'best' for the first two SELECTs
.Those indexes work quite well for #4: select count(*) from ts_user_core where status='ok' and gender='female'
, too. So no extra index needed for it.
Now, let's work on #3: select count(*) from ts_user_core where (birthday between '1990-01-01' and '2000-01-01');
INDEX(birthday)
is essentially the only choice.Now, suppose we also had ... WHERE status='foo';
(without gender
). That would force us to pick INDEX(status, gender, birthday)
instead of the variant of it.
Result: 2 good indexes to handle all 5 selects:
INDEX(status, gender, birthday)
INDEX(birthday)
Suggestion: If you end up with more than 5 INDEXes
or an index with more than 5 columns in it, it is probably wise to shorten some indexes. Here is where things get really fuzzy. If you would like to present me with a dozen 'realistic' indexes, I'll walk you through it.
Notes on other comments:
3.6
vs 0.140
smells like caching of the index.)SQL_NO_CACHE
.EXPLAIN
plain; we can help you read it.INDEX(a,b,c)
, you don't need INDEX(a,b)
.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With