Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL queries on two different indexes fast, but combined into one query slow. Why?

I have a table with 2 million rows. I have two index (status, gender) and also (birthday).

I find strange that this query is taking 3.6 seconds or more QUERY N° 1

SELECT COUNT(*) FROM ts_user_core
WHERE birthday BETWEEN '1980-01-01' AND '1985-01-01'
    AND status='ok' AND gender='female';

same for this: QUERY N° 2

SELECT COUNT(*) FROM ts_user_core
WHERE status='ok' AND gender='female'
    AND birthday between '1980-01-01' AND '1985-01-01';

While this query is taking 0.140 seconds QUERY N° 3

select count(*) from ts_user_core where (birthday between '1990-01-01' and '2000-01-01');

Also this query takes 0.2 seconds QUERY N° 4

select count(*) from ts_user_core where status='ok' and gender='female'

I expect the first query to be way more faster, how can be possible this behavior? I can't handle so much time for this query.

Here the result of: enter image description here

I know that I can add a new index with 3 columns, but is there a way to have a faster query without adding an index for every where clause?

Thanks for your advice

like image 488
Stefano Giacone Avatar asked Apr 28 '15 13:04

Stefano Giacone


People also ask

Why is my MySQL query running slow?

Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.

Why does index query slow down?

Having two identical indexes makes a negative impact on the performance of SQL queries. It is actually a waste of disk space and also slows down the insertions to the table. Therefore, it is a good practice to avoid duplicate indexes to eliminate these issues. Duplication of indexes can happen in multiple ways.

Does group by Make query slower?

GROUP BY is a powerful statement, but it tends to slow down queries.


1 Answers

is there a way to optimize the query without adding an index for every possible where clause?

Yes, somewhat. But it takes an understanding of how INDEXes work.

Let's look at all the SELECTs you have presented so far.

  1. To build the optimal index for a SELECT, start with all the = constant items in the WHERE clause. Put those columns into an index in any order. That gives us INDEX(status, gender, ...) or INDEX(gender, status, ...), but nothing deciding between them (yet).
  2. add on one range or all the ORDER BY. In your first couple of SELECTs, that would be birthday. Now we have INDEX(status, gender, birthday) or INDEX(gender, status, birthday). Either of these is 'best' for the first two SELECTs.

Those indexes work quite well for #4: select count(*) from ts_user_core where status='ok' and gender='female', too. So no extra index needed for it.

Now, let's work on #3: select count(*) from ts_user_core where (birthday between '1990-01-01' and '2000-01-01');

  • It cannot use the indexes we have so far.
  • INDEX(birthday) is essentially the only choice.

Now, suppose we also had ... WHERE status='foo'; (without gender). That would force us to pick INDEX(status, gender, birthday) instead of the variant of it.

Result: 2 good indexes to handle all 5 selects:

INDEX(status, gender, birthday)
INDEX(birthday)

Suggestion: If you end up with more than 5 INDEXes or an index with more than 5 columns in it, it is probably wise to shorten some indexes. Here is where things get really fuzzy. If you would like to present me with a dozen 'realistic' indexes, I'll walk you through it.

Notes on other comments:

  • For timing, run each query twice and take the second time -- to avoid caching effects. (Your 3.6 vs 0.140 smells like caching of the index.)
  • For timing, turn off the Query cache or use SQL_NO_CACHE.
  • The optimizer rarely uses two indexes in a single query.
  • Show us the EXPLAIN plain; we can help you read it.
  • The extra time taken to pick among multiple INDEXes is usually worth it.
  • If you have INDEX(a,b,c), you don't need INDEX(a,b).
like image 189
Rick James Avatar answered Oct 29 '22 16:10

Rick James