Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing slow MySQL select query

EDIT: After looking at some of the answers here and hours of research, my team came to the conclusion there was most likely no way to optimize this further than the 4.5 seconds we were able to achieve (unless maybe with partitioning on offers_clicks, but that would have some ugly side-effects). Eventually, after lots of brainstorming, we decided to split both queries, create two sets of user ids (one from users table and one from offers_clicks), and compare them with set in Python. The set of ids from users table is still pulled from SQL, but we decided to move offers_clicks to Lucene and also added some caching on top of it, so that's where the other set of ids is now pulled from. The end result is that its down to about half a second with cache and 0.9s without cache.

Start of original post: I have trouble getting a query optimized. The first version of the query is fine, but the moment offers_clicks is joined in the 2nd query, the query becomes rather slow. Users table contains 10 million rows, offers_clicks contains 53 million rows.

Acceptable performance:

SELECT count(distinct(users.id)) AS count_1
FROM users USE index (country_2)
WHERE users.country = 'US'
  AND users.last_active > '2015-02-26';
1 row in set (0.35 sec)

Bad:

SELECT count(distinct(users.id)) AS count_1
FROM offers_clicks USE index (user_id_3), users USE index (country_2)
WHERE users.country = 'US'
  AND users.last_active > '2015-02-26'
  AND offers_clicks.user_id = users.id
  AND offers_clicks.date > '2015-02-14'
  AND offers_clicks.ranking_score < 3.49
  AND offers_clicks.ranking_score > 0.24;
1 row in set (7.39 sec)

Here's how it looks without specificying any indexes (even worse):

SELECT count(distinct(users.id)) AS count_1
FROM offers_clicks, users
WHERE users.country IN ('US')
  AND users.last_active > '2015-02-26'
  AND offers_clicks.user_id = users.id
  AND offers_clicks.date > '2015-02-14'
  AND offers_clicks.ranking_score < 3.49
  AND offers_clicks.ranking_score > 0.24;
1 row in set (17.72 sec)

Explain:

explain SELECT count(distinct(users.id)) AS count_1 FROM offers_clicks USE index (user_id_3), users USE index (country_2) WHERE users.country IN ('US') AND users.last_active > '2015-02-26' AND offers_clicks.user_id = users.id AND offers_clicks.date > '2015-02-14' AND offers_clicks.ranking_score < 3.49 AND offers_clicks.ranking_score > 0.24;
+----+-------------+---------------+-------+---------------+-----------+---------+------------------------------+--------+--------------------------+
| id | select_type | table         | type  | possible_keys | key       | key_len | ref                          | rows   | Extra                    |
+----+-------------+---------------+-------+---------------+-----------+---------+------------------------------+--------+--------------------------+
|  1 | SIMPLE      | users         | range | country_2     | country_2 | 14      | NULL                         | 245014 | Using where; Using index |
|  1 | SIMPLE      | offers_clicks | ref   | user_id_3     | user_id_3 | 4       | dejong_pointstoshop.users.id | 270153 | Using where; Using index |
+----+-------------+---------------+-------+---------------+-----------+---------+------------------------------+--------+--------------------------+

Explain without specifying any indexes:

mysql> explain SELECT count(distinct(users.id)) AS count_1 FROM offers_clicks, users WHERE users.country IN ('US') AND users.last_active > '2015-02-26' AND offers_clicks.user_id = users.id AND offers_clicks.date > '2015-02-14' AND offers_clicks.ranking_score < 3.49 AND offers_clicks.ranking_score > 0.24;
+----+-------------+---------------+-------+------------------------------------------------------------------------+-----------+---------+------------------------------+--------+--------------------------+
| id | select_type | table         | type  | possible_keys                                                          | key       | key_len | ref                          | rows   | Extra                    |
+----+-------------+---------------+-------+------------------------------------------------------------------------+-----------+---------+------------------------------+--------+--------------------------+
|  1 | SIMPLE      | users         | range | PRIMARY,last_active,country,last_active_2,country_2                    | country_2 | 14      | NULL                         | 221606 | Using where; Using index |
|  1 | SIMPLE      | offers_clicks | ref   | user_id,user_id_2,date,date_2,date_3,ranking_score,user_id_3,user_id_4 | user_id_2 | 4       | dejong_pointstoshop.users.id |      3 | Using where              |
+----+-------------+---------------+-------+------------------------------------------------------------------------+-----------+---------+------------------------------+--------+--------------------------+

Here's a whole bunch of indexes I tried with not too much success:

+---------------+------------+-----------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name                    | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+-----------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| offers_clicks |          1 | user_id_3                   |            1 | user_id         | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_3                   |            2 | ranking_score   | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_3                   |            3 | date            | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_2                   |            1 | user_id         | A         |    17838712 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_2                   |            2 | date            | A         |    53516137 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_4                   |            1 | user_id         | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_4                   |            2 | date            | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_4                   |            3 | ranking_score   | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| users         |          1 | country_2                   |            1 | country         | A         |          14 |     NULL | NULL   |      | BTREE      |         |               |
| users         |          1 | country_2                   |            2 | last_active     | A         |     8048529 |     NULL | NULL   |      | BTREE      |         |               |

Simplified users schema:

+---------------------------------+---------------+------+-----+---------------------+----------------+
| Field                           | Type          | Null | Key | Default             | Extra          |
+---------------------------------+---------------+------+-----+---------------------+----------------+
| id                              | int(11)       | NO   | PRI | NULL                | auto_increment |
| country                         | char(2)       | NO   | MUL |                     |                |
| last_active                     | datetime      | NO   | MUL | 2000-01-01 00:00:00 |                |

Simplified offers clicks schema:

+-----------------+------------------+------+-----+---------------------+----------------+
| Field           | Type             | Null | Key | Default             | Extra          |
+-----------------+------------------+------+-----+---------------------+----------------+
| id              | int(11)          | NO   | PRI | NULL                | auto_increment |
| user_id         | int(11)          | NO   | MUL | 0                   |                |
| offer_id        | int(11) unsigned | NO   | MUL | NULL                |                |
| date            | datetime         | NO   | MUL | 0000-00-00 00:00:00 |                |
| ranking_score   | decimal(5,2)     | NO   | MUL | 0.00                |                |
like image 336
Mathijs de Jong Avatar asked Mar 08 '15 21:03

Mathijs de Jong


People also ask

How can you increase the performance of MySQL SELECT query?

Adjust the size and properties of the memory areas that MySQL uses for caching. With efficient use of the InnoDB buffer pool, MyISAM key cache, and the MySQL query cache, repeated queries run faster because the results are retrieved from memory the second and subsequent times.


1 Answers

This is your query:

SELECT count(distinct u.id) AS count_1
FROM offers_clicks oc JOIN
     users u
     ON oc.user_id = u.id
WHERE u.country IN ('US') AND u.last_active > '2015-02-26' AND
      oc.date > '2015-02-14' AND
      oc.ranking_score > 0.24 AND oc.ranking_score < 3.49;

First, instead of count(distinct), you might consider writing the query as:

SELECT count(*) AS count_1
FROM users u
WHERE u.country IN ('US') AND u.last_active > '2015-02-26' AND
      EXISTS (SELECT 1
              FROM offers_clicks oc
              WHERE oc.user_id = u.id AND
                    oc.date > '2015-02-14' AND
                    oc.ranking_score > 0.24 AND oc.ranking_score < 3.49
             )

Then, the best indexes for this query are: users(country, last_active, id) and either offers_clicks(user_id, date, ranking_score) or offers_clicks(user_id, ranking_score, date).

like image 171
Gordon Linoff Avatar answered Oct 12 '22 19:10

Gordon Linoff