Why is MySQL slow when using LIMIT in my query?

I'm trying to figure out why one of my queries is slow and how I can fix it, but I'm a bit puzzled by my results.

I have an orders table with around 80 columns and 775179 rows, and I'm running the following query:

SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC LIMIT 200

which returns 38 rows in 4.5s

When I remove the ORDER BY, I get a nice improvement:

SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL LIMIT 200

38 rows in 0.30s

But when I remove the LIMIT without touching the ORDER BY, I get an even better result:

SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC

38 rows in 0.10s (??)

Why is my LIMIT so hungry?

GOING FURTHER

I was trying a few things before posting my question, and after noticing that I had an index on creation_date (which is a datetime), I removed it. The first query now runs in 0.10s. Why is that?

EDIT

Good guess: I have indexes on the other columns used in the WHERE clause.

mysql> explain SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC LIMIT 200;
+----+-------------+--------+-------+------------------------+---------------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys          | key           | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+------------------------+---------------+---------+------+------+-------------+
|  1 | SIMPLE      | orders | index | id_state_idx,id_mp_idx | creation_date | 5       | NULL | 1719 | Using where |
+----+-------------+--------+-------+------------------------+---------------+---------+------+------+-------------+

1 row in set (0.00 sec)

mysql> explain SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC;
+----+-------------+--------+-------+------------------------+-----------+---------+------+-------+----------------------------------------------------+
| id | select_type | table  | type  | possible_keys          | key       | key_len | ref  | rows  | Extra                                              |
+----+-------------+--------+-------+------------------------+-----------+---------+------+-------+----------------------------------------------------+
|  1 | SIMPLE      | orders | range | id_state_idx,id_mp_idx | id_mp_idx | 3       | NULL | 87502 | Using index condition; Using where; Using filesort |
+----+-------------+--------+-------+------------------------+-----------+---------+------+-------+----------------------------------------------------+

Alexandre Jacob asked Jul 19 '13 13:07


3 Answers

Indexes do not necessarily improve performance. To better understand what is happening, it would help if you included the EXPLAIN output for the different queries.

My best guess would be that you have an index on id_state, or even on (id_state, id_mp), that can be used to satisfy the WHERE clause. If so, the query without the ORDER BY would use this index, and it should be pretty fast. Even without an index, it requires only a sequential scan of the pages in the orders table, which can still be pretty fast.

Then, when you add the index on creation_date, MySQL decides to use that index for the ORDER BY instead. This requires reading each entry in the index, then fetching the corresponding data page to check the WHERE conditions and return the columns (if there is a match). These reads are highly inefficient because they happen in index order rather than in "page" order, which turns them into random reads.

Worse, even though you have a LIMIT, MySQL still has to read the entire table: only 38 rows match, fewer than the limit of 200, so the scan can never stop early. Although you have saved a sort on 38 records, you have created a massively inefficient query.
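
One way to check this explanation on your own server is to look at the session handler counters around the slow query. This is only a sketch: FLUSH STATUS typically needs the RELOAD privilege, and the exact counter that grows depends on the plan MySQL picks. If the index on creation_date is being walked backwards from end to end, Handler_read_prev should come out close to the number of rows in the table rather than close to 200.

```sql
-- Reset the session status counters (typically requires the RELOAD privilege).
FLUSH STATUS;

-- The slow query from the question, unchanged.
SELECT *
FROM orders
WHERE id_state = 2
  AND id_mp IS NOT NULL
ORDER BY creation_date DESC
LIMIT 200;

-- Show how many index and row reads the query above performed.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Running the same three statements for the variant without the LIMIT gives a second set of counters to compare against.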

By the way, this situation gets significantly worse if the orders table does not fit in available memory. Then you have a condition called "thrashing", where each new record tends to generate a new I/O read. So, if a page has 100 records on it, the page might have to be read 100 times.

You can make all these queries run faster by having an index on orders(id_state, id_mp, creation_date). The where clause will use the first two columns and the order by will use the last.
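
As a sketch, the suggested index could be created and checked like this. The index name idx_state_mp_created is made up for illustration; the table and column names come from the question. Note that id_mp IS NOT NULL is a range condition, so EXPLAIN may still report "Using filesort" with this index. In that case the sort only covers the rows that already passed the WHERE clause (38 here), which should be cheap.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX idx_state_mp_created
    ON orders (id_state, id_mp, creation_date);

-- Re-check which index the optimizer picks for the slow query.
EXPLAIN
SELECT *
FROM orders
WHERE id_state = 2
  AND id_mp IS NOT NULL
ORDER BY creation_date DESC
LIMIT 200;
```

Building an index on a table of this size takes time and adds write overhead, so it is worth trying on a copy of the data first.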

Gordon Linoff answered Oct 05 '22 05:10


The same problem happened in my project. I ran some tests and found that LIMIT is slow because of row lookups.

See: MySQL ORDER BY / LIMIT performance: late row lookups

So, the solution is:

(A) When using LIMIT, select only the PK columns, not all columns.

(B) Select all the columns you need, then join with the result set of (A).

The SQL should look like this:

SELECT
    *
FROM
    orders O1                          -- the full rows you want
JOIN
    (
        SELECT
            ID                         -- fetch the PK column only; this should be fast
        FROM
            orders
        WHERE
            [your query condition]     -- filter records by condition
        ORDER BY
            [your order by condition]  -- control the record order
        LIMIT 2000, 50                 -- select the requested page
    ) AS O2
ON
    O1.ID = O2.ID
ORDER BY
    [your order by condition]          -- control the record order

In my DB, the old SQL, which selects all columns using "LIMIT 21560, 20", takes about 4.484s.

The new SQL takes only 0.063s, which is about 71 times faster.
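
Applied to the query from the question, the pattern might look like the sketch below. It assumes the primary key of orders is a single column named id, which the question does not state; adjust the name to match the real schema.

```sql
-- Late row lookup: choose the 200 ids first, then fetch the wide rows.
-- The primary key name `id` is an assumption.
SELECT o.*
FROM orders AS o
JOIN (
    SELECT id
    FROM orders
    WHERE id_state = 2
      AND id_mp IS NOT NULL
    ORDER BY creation_date DESC
    LIMIT 200
) AS page ON page.id = o.id
ORDER BY o.creation_date DESC;
```

The outer ORDER BY is repeated because the join does not guarantee that rows come back in the order the inner query produced them.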

Li Ying answered Oct 05 '22 05:10


I had a similar issue on a table of 2.5 million records. Without the LIMIT part, the query took a few seconds. With the LIMIT part, it hung forever.

I solved it with a subquery. In your case it would become:

SELECT * 
FROM 
    (SELECT * 
     FROM orders 
     WHERE id_state = 2 
       AND id_mp IS NOT NULL 
     ORDER BY creation_date DESC) tmp 
LIMIT 200

I noticed that the original query was fast when the number of matching rows was greater than the LIMIT value. So the query became extremely slow only when the LIMIT had no effect because fewer rows matched than it allowed.


Another solution is to force an index. In your case you can try:

SELECT * 
FROM orders force index (id_mp_idx) 
WHERE id_state = 2 
  AND id_mp IS NOT NULL 
ORDER BY creation_date DESC 
LIMIT 200
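
A softer variant of the same idea is to rule out only the index that causes the slow plan and let the optimizer choose among the rest. This sketch assumes the index on creation_date is literally named creation_date, as the key column of the first EXPLAIN output in the question suggests.

```sql
-- Exclude only the creation_date index; the optimizer remains free to
-- pick id_state_idx or id_mp_idx and then sort the few matching rows.
SELECT *
FROM orders IGNORE INDEX (creation_date)
WHERE id_state = 2
  AND id_mp IS NOT NULL
ORDER BY creation_date DESC
LIMIT 200;
```

Index hints are tied to index names, so either form needs updating if an index is renamed or dropped.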

Tama answered Oct 05 '22 05:10