I have a problem with this query:
SELECT
uca.user_activity_id,
uca.user_call_id,
uca.call_activity_id,
uca.user_activity_token,
uc.call_group_id,
uc.user_id
FROM users_calls_activities uca
INNER JOIN users_calls_activities uca2 ON uca2.user_activity_id = uca.user_activity_is_validated_with
AND aux.user_call_id = 1744136
INNER JOIN users_calls uc ON uc.user_call_id = uca.user_call_id;
We have a cluster with percona server (5.6.29) with 5 nodes(from 0 to 4) in Azure. The difference between nodes 0-3 and 4 is that, the first ones are in a balancer and the node 4 is out of the balancer (but in the cluster)
The problem is that in four of the servers (nodes 0-3) the query is really slow (15 sec) and in the other one (node 4)the query is really fast (0,002)
Afaik, the explain plan should be the same but i execute an EXPLAIN
and the result is this:
Nodes 0-3 (Slow)
+----+-------------+-------+------+-------------------------------------------------------------+--------------+---------+-------------------------------+---------+---------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+-------------------------------------------------------------+--------------+---------+-------------------------------+---------+---------------------------------------+ | 1 | SIMPLE | uca2 | ref | PRIMARY,user_call_id,user_call_id_2 | user_call_id | 4 | const | 1 | Using index | | 1 | SIMPLE | uc | ALL | PRIMARY,user_call_id | NULL | NULL | NULL | 2098152 | Using join buffer (Block Nested Loop) | | 1 | SIMPLE | uca | ref | user_call_id,user_call_id_2,is_validated_with | user_call_id | 4 | db.uc.user_call_id | 1 | Using where | +----+-------------+-------+------+-------------------------------------------------------------+--------------+---------+-------------------------------+---------+---------------------------------------+
Node 4 (Fast)
+----+-------------+-------+--------+-------------------------------------------------------------+---------------------------------+---------+-----------------------------------+---------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+-------------------------------------------------------------+---------------------------------+---------+-----------------------------------+---------+-----------------------+ | 1 | SIMPLE | uca2 | ref | PRIMARY,user_call_id,user_call_id_2 | user_call_id | 4 | const | 1 | Using index | | 1 | SIMPLE | uca | ref | user_call_id,user_call_id_2,is_validated_with | is_validated_with | 5 | db.uc2.user_activity_id | 2755595 | Using index condition | | 1 | SIMPLE | uc | eq_ref | PRIMARY,user_call_id | PRIMARY | 4 | db.uca.user_call_id | 1 | NULL | +----+-------------+-------+--------+-------------------------------------------------------------+---------------------------------+---------+-----------------------------------+---------+-----------------------+
I notice that in the slow one the index is not being used. so i checked the indexes:
Node 0:
+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | users_calls | 0 | PRIMARY | 1 | user_call_id | A | 2099153 | NULL | NULL | | BTREE | | | | users_calls | 1 | call_group_id | 1 | call_group_id | A | 16659 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_call_begin_date | 1 | user_call_begin_date | A | 1049576 | NULL | NULL | YES | BTREE | | | | users_calls | 1 | user_call_begin_date | 2 | user_call_end_date | A | 2099153 | NULL | NULL | YES | BTREE | | | | users_calls | 1 | user_call_id | 1 | user_call_id | A | 2099153 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_call_id | 2 | user_id | A | 2099153 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_id | 1 | user_id | A | 91267 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_id | 2 | call_id | A | 2099153 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_id | 3 | user_call_status | A | 2099153 | NULL | NULL | | BTREE | | | | users_calls | 1 | fk_users_calls_calls | 1 | call_id | A | 23067 | NULL | NULL | | BTREE | | | +-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Node 4:
+-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | users_calls | 0 | PRIMARY | 1 | user_call_id | A | 2091476 | NULL | NULL | | BTREE | | | | users_calls | 1 | call_group_id | 1 | call_group_id | A | 26813 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_call_begin_date | 1 | user_call_begin_date | A | 1045738 | NULL | NULL | YES | BTREE | | | | users_calls | 1 | user_call_begin_date | 2 | user_call_end_date | A | 2091476 | NULL | NULL | YES | BTREE | | | | users_calls | 1 | user_call_id | 1 | user_call_id | A | 2091476 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_call_id | 2 | user_id | A | 2091476 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_id | 1 | user_id | A | 53627 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_id | 2 | call_id | A | 2091476 | NULL | NULL | | BTREE | | | | users_calls | 1 | user_id | 3 | user_call_status | A | 2091476 | NULL | NULL | | BTREE | | | | users_calls | 1 | fk_users_calls_calls | 1 | call_id | A | 15608 | NULL | NULL | | BTREE | | | +-----------------+------------+----------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
My first question is why are the indexes different? it should be the same due to both are in the same cluster right?
Why the execute plans are different? Both are in the same cluster so, it should be the same
Should i use FORCE INDEX
or STRAIGHT_JOIN
?
The answer is hidden in the two explain outputs - on node 4, the second join is using the index is_validated_with
on the uca
table, but the expected # of rows is 2755595, which is greater than the number of rows for a full scan of the uc
table on the other plan.
With the available information it's hard to say for sure, but since nodes 0-3 are seeing active use while node4 isn't, my guess is that the statistics that the optimizer uses to decide on query plans may no longer reflect the actual state of the table. You could try to run ANALYZE TABLE on all three tables on all nodes and I suspect you'll see the same plan generated (assuming all nodes have the same data).
In addition to using FORCE_INDEX
, you could also tweak the optimizer flags to try to favor one plan over another, but it's generally much better to fix the underlying issue, as you might fix this problem now only to get bitten by something else later on.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With