Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Different explain plan in same cluster

Tags:

mysql

percona

I have a problem with this query:

SELECT 
    uca.user_activity_id,
    uca.user_call_id,
    uca.call_activity_id,
    uca.user_activity_token,
    uc.call_group_id,
    uc.user_id
FROM users_calls_activities uca
INNER JOIN users_calls_activities uca2 ON uca2.user_activity_id = uca.user_activity_is_validated_with
    AND aux.user_call_id = 1744136
INNER JOIN users_calls uc ON uc.user_call_id = uca.user_call_id;

We have a cluster with percona server (5.6.29) with 5 nodes(from 0 to 4) in Azure. The difference between nodes 0-3 and 4 is that, the first ones are in a balancer and the node 4 is out of the balancer (but in the cluster)

The problem is that in four of the servers (nodes 0-3) the query is really slow (15 sec) and in the other one (node 4)the query is really fast (0,002)

Afaik, the explain plan should be the same but i execute an EXPLAIN and the result is this:

Nodes 0-3 (Slow)

+----+-------------+-------+------+-------------------------------------------------------------+--------------+---------+-------------------------------+---------+---------------------------------------+
| id | select_type | table | type | possible_keys                                               | key          | key_len | ref                           | rows    | Extra                                 |
+----+-------------+-------+------+-------------------------------------------------------------+--------------+---------+-------------------------------+---------+---------------------------------------+
|  1 | SIMPLE      | uca2  | ref  | PRIMARY,user_call_id,user_call_id_2                         | user_call_id | 4       | const                         |       1 | Using index                           |
|  1 | SIMPLE      | uc    | ALL  | PRIMARY,user_call_id                                        | NULL         | NULL    | NULL                          | 2098152 | Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | uca   | ref  | user_call_id,user_call_id_2,is_validated_with               | user_call_id | 4       | db.uc.user_call_id            |       1 | Using where                           |
+----+-------------+-------+------+-------------------------------------------------------------+--------------+---------+-------------------------------+---------+---------------------------------------+

Node 4 (Fast)

+----+-------------+-------+--------+-------------------------------------------------------------+---------------------------------+---------+-----------------------------------+---------+-----------------------+
| id | select_type | table | type   | possible_keys                                               | key                             | key_len | ref                               | rows    | Extra                 |
+----+-------------+-------+--------+-------------------------------------------------------------+---------------------------------+---------+-----------------------------------+---------+-----------------------+
|  1 | SIMPLE      | uca2  | ref    | PRIMARY,user_call_id,user_call_id_2                         | user_call_id                    | 4       | const                             |       1 | Using index           |
|  1 | SIMPLE      | uca   | ref    | user_call_id,user_call_id_2,is_validated_with               | is_validated_with               | 5       | db.uc2.user_activity_id           | 2755595 | Using index condition |
|  1 | SIMPLE      | uc    | eq_ref | PRIMARY,user_call_id                                        | PRIMARY                         | 4       | db.uca.user_call_id               |       1 | NULL                  |
+----+-------------+-------+--------+-------------------------------------------------------------+---------------------------------+---------+-----------------------------------+---------+-----------------------+

I notice that in the slow one the index is not being used. so i checked the indexes:

Node 0:

+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name             | Seq_in_index | Column_name          | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users_calls     |          0 | PRIMARY              |            1 | user_call_id         | A         |     2099153 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | call_group_id        |            1 | call_group_id        | A         |       16659 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_call_begin_date |            1 | user_call_begin_date | A         |     1049576 |     NULL | NULL   | YES  | BTREE      |         |               |
| users_calls     |          1 | user_call_begin_date |            2 | user_call_end_date   | A         |     2099153 |     NULL | NULL   | YES  | BTREE      |         |               |
| users_calls     |          1 | user_call_id         |            1 | user_call_id         | A         |     2099153 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_call_id         |            2 | user_id              | A         |     2099153 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_id              |            1 | user_id              | A         |       91267 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_id              |            2 | call_id              | A         |     2099153 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_id              |            3 | user_call_status     | A         |     2099153 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | fk_users_calls_calls |            1 | call_id              | A         |       23067 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Node 4:

+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name             | Seq_in_index | Column_name          | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users_calls     |          0 | PRIMARY              |            1 | user_call_id         | A         |     2091476 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | call_group_id        |            1 | call_group_id        | A         |       26813 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_call_begin_date |            1 | user_call_begin_date | A         |     1045738 |     NULL | NULL   | YES  | BTREE      |         |               |
| users_calls     |          1 | user_call_begin_date |            2 | user_call_end_date   | A         |     2091476 |     NULL | NULL   | YES  | BTREE      |         |               |
| users_calls     |          1 | user_call_id         |            1 | user_call_id         | A         |     2091476 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_call_id         |            2 | user_id              | A         |     2091476 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_id              |            1 | user_id              | A         |       53627 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_id              |            2 | call_id              | A         |     2091476 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | user_id              |            3 | user_call_status     | A         |     2091476 |     NULL | NULL   |      | BTREE      |         |               |
| users_calls     |          1 | fk_users_calls_calls |            1 | call_id              | A         |       15608 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

My first question is why are the indexes different? it should be the same due to both are in the same cluster right?

Why the execute plans are different? Both are in the same cluster so, it should be the same

Should i use FORCE INDEX or STRAIGHT_JOIN?

like image 506
Sal00m Avatar asked Mar 31 '16 07:03

Sal00m


1 Answers

The answer is hidden in the two explain outputs - on node 4, the second join is using the index is_validated_with on the uca table, but the expected # of rows is 2755595, which is greater than the number of rows for a full scan of the uc table on the other plan.

With the available information it's hard to say for sure, but since nodes 0-3 are seeing active use while node4 isn't, my guess is that the statistics that the optimizer uses to decide on query plans may no longer reflect the actual state of the table. You could try to run ANALYZE TABLE on all three tables on all nodes and I suspect you'll see the same plan generated (assuming all nodes have the same data).

In addition to using FORCE_INDEX, you could also tweak the optimizer flags to try to favor one plan over another, but it's generally much better to fix the underlying issue, as you might fix this problem now only to get bitten by something else later on.

like image 184
atomic77 Avatar answered Oct 13 '22 01:10

atomic77