Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

join query optimization

recently I come up with some mysql performance issues, here's the query:

select
        assoc.id                    as id,
        parentNode.id               as parentNodeId,
        parentNode.version          as parentNodeVersion,
        parentStore.protocol        as parentNodeProtocol,
        parentStore.identifier      as parentNodeIdentifier,
        parentNode.uuid             as parentNodeUuid,
        childNode.id                as childNodeId,
        childNode.version           as childNodeVersion,
        childStore.protocol         as childNodeProtocol,
        childStore.identifier       as childNodeIdentifier,
        childNode.uuid              as childNodeUuid,
        assoc.type_qname_id         as type_qname_id,
        assoc.child_node_name_crc   as child_node_name_crc,
        assoc.child_node_name       as child_node_name,
        assoc.qname_ns_id           as qname_ns_id,
        assoc.qname_localname       as qname_localname,
        assoc.is_primary            as is_primary,
        assoc.assoc_index           as assoc_index


    from
        alf_child_assoc assoc
        join alf_node parentNode on (parentNode.id = assoc.parent_node_id)
        join alf_store parentStore on (parentStore.id = parentNode.store_id)
        join alf_node childNode on (childNode.id = assoc.child_node_id)
        join alf_store childStore on (childStore.id = childNode.store_id)

    where
        parentNode.id = 837
    order by
        assoc.assoc_index ASC,
        assoc.id ASC;

The problem is that the query took too long sorting and sending result set. Any help is appreciated, thanks in advance

some count info:

mysql> select count(id) from alf_child_assoc;
+-----------+
| count(id) |
+-----------+
|   7208882 |
+-----------+
1 row in set (12.51 sec)

mysql> select count(id) from alf_node;
+-----------+
| count(id) |
+-----------+
|   3986059 |
+-----------+
1 row in set (1.68 sec)

mysql> select count(id) from alf_store;
+-----------+
| count(id) |
+-----------+
|         6 |
+-----------+
1 row in set (0.00 sec)

This is the explain output:

+----+-------------+-------------+--------+---------------------------------------------------------------------+----------------+---------+--------------------------+--------+----------------+
| id | select_type | table       | type   | possible_keys                                                       | key            | key_len | ref                      | rows   | Extra          |
+----+-------------+-------------+--------+---------------------------------------------------------------------+----------------+---------+--------------------------+--------+----------------+
|  1 | SIMPLE      | parentNode  | const  | PRIMARY,store_id,fk_alf_node_store,will_store_node_idx              | PRIMARY        | 8       | const                    |      1 | Using filesort |
|  1 | SIMPLE      | parentStore | const  | PRIMARY                                                             | PRIMARY        | 8       | const                    |      1 |                |
|  1 | SIMPLE      | assoc       | ref    | parent_node_id,fk_alf_cass_pnode,fk_alf_cass_cnode,idx_alf_cass_pri | parent_node_id | 8       | const                    | 275218 | Using where    |
|  1 | SIMPLE      | childNode   | eq_ref | PRIMARY,store_id,fk_alf_node_store,will_store_node_idx              | PRIMARY        | 8       | repo.assoc.child_node_id |      1 |                |
|  1 | SIMPLE      | childStore  | eq_ref | PRIMARY                                                             | PRIMARY        | 8       | repo.childNode.store_id  |      1 |                |
+----+-------------+-------------+--------+---------------------------------------------------------------------+----------------+---------+--------------------------+--------+----------------+
5 rows in set (0.00 sec)

Below is some table-related informations:

mysql> desc alf_child_assoc;
+---------------------+--------------+------+-----+---------+----------------+
| Field               | Type         | Null | Key | Default | Extra          |
+---------------------+--------------+------+-----+---------+----------------+
| id                  | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| version             | bigint(20)   | NO   |     | NULL    |                |
| parent_node_id      | bigint(20)   | NO   | MUL | NULL    |                |
| type_qname_id       | bigint(20)   | NO   | MUL | NULL    |                |
| child_node_name_crc | bigint(20)   | NO   |     | NULL    |                |
| child_node_name     | varchar(50)  | NO   |     | NULL    |                |
| child_node_id       | bigint(20)   | NO   | MUL | NULL    |                |
| qname_ns_id         | bigint(20)   | NO   | MUL | NULL    |                |
| qname_localname     | varchar(255) | NO   |     | NULL    |                |
| qname_crc           | bigint(20)   | NO   | MUL | NULL    |                |
| is_primary          | bit(1)       | YES  |     | NULL    |                |
| assoc_index         | int(11)      | YES  | MUL | NULL    |                |
+---------------------+--------------+------+-----+---------+----------------+
12 rows in set (0.00 sec)

mysql> desc alf_node;
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| id             | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| version        | bigint(20)   | NO   |     | NULL    |                |
| store_id       | bigint(20)   | NO   | MUL | NULL    |                |
| uuid           | varchar(36)  | NO   |     | NULL    |                |
| transaction_id | bigint(20)   | NO   | MUL | NULL    |                |
| node_deleted   | bit(1)       | NO   | MUL | NULL    |                |
| type_qname_id  | bigint(20)   | NO   | MUL | NULL    |                |
| locale_id      | bigint(20)   | NO   | MUL | NULL    |                |
| acl_id         | bigint(20)   | YES  | MUL | NULL    |                |
| audit_creator  | varchar(255) | YES  |     | NULL    |                |
| audit_created  | varchar(30)  | YES  |     | NULL    |                |
| audit_modifier | varchar(255) | YES  |     | NULL    |                |
| audit_modified | varchar(30)  | YES  |     | NULL    |                |
| audit_accessed | varchar(30)  | YES  |     | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+
14 rows in set (0.00 sec)

mysql> desc alf_store;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| version      | bigint(20)   | NO   |     | NULL    |                |
| protocol     | varchar(50)  | NO   | MUL | NULL    |                |
| identifier   | varchar(100) | NO   |     | NULL    |                |
| root_node_id | bigint(20)   | YES  | MUL | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)

mysql> 
mysql> show index from alf_node;
+----------+------------+----------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name             | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| alf_node |          0 | PRIMARY              |            1 | id             | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          0 | store_id             |            1 | store_id       | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          0 | store_id             |            2 | uuid           | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | idx_alf_node_del     |            1 | node_deleted   | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | idx_alf_node_txn_del |            1 | transaction_id | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | idx_alf_node_txn_del |            2 | node_deleted   | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | fk_alf_node_acl      |            1 | acl_id         | A         |     3890448 |     NULL | NULL   | YES  | BTREE      |         |               |
| alf_node |          1 | fk_alf_node_txn      |            1 | transaction_id | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | fk_alf_node_store    |            1 | store_id       | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | fk_alf_node_tqn      |            1 | type_qname_id  | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | fk_alf_node_loc      |            1 | locale_id      | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | will_store_node_idx  |            1 | id             | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
| alf_node |          1 | will_store_node_idx  |            2 | store_id       | A         |     3890448 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+----------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
13 rows in set (0.01 sec)

mysql> show index from alf_store;
+-----------+------------+-------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name          | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+-------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| alf_store |          0 | PRIMARY           |            1 | id           | A         |           6 |     NULL | NULL   |      | BTREE      |         |               |
| alf_store |          0 | protocol          |            1 | protocol     | A         |           6 |     NULL | NULL   |      | BTREE      |         |               |
| alf_store |          0 | protocol          |            2 | identifier   | A         |           6 |     NULL | NULL   |      | BTREE      |         |               |
| alf_store |          1 | fk_alf_store_root |            1 | root_node_id | A         |           6 |     NULL | NULL   | YES  | BTREE      |         |               |
+-----------+------------+-------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)

mysql> show index from alf_child_assoc;
+-----------------+------------+--------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name           | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| alf_child_assoc |          0 | PRIMARY            |            1 | id                  | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          0 | parent_node_id     |            1 | parent_node_id      | A         |      632375 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          0 | parent_node_id     |            2 | type_qname_id       | A         |      632375 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          0 | parent_node_id     |            3 | child_node_name_crc | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          0 | parent_node_id     |            4 | child_node_name     | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | fk_alf_cass_pnode  |            1 | parent_node_id      | A         |      695612 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | fk_alf_cass_cnode  |            1 | child_node_id       | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | fk_alf_cass_tqn    |            1 | type_qname_id       | A         |          16 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | fk_alf_cass_qnns   |            1 | qname_ns_id         | A         |          16 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | idx_alf_cass_qncrc |            1 | qname_crc           | A         |     3478063 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | idx_alf_cass_qncrc |            2 | type_qname_id       | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | idx_alf_cass_qncrc |            3 | parent_node_id      | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | idx_alf_cass_pri   |            1 | parent_node_id      | A         |     1159354 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | idx_alf_cass_pri   |            2 | is_primary          | A         |     1159354 |     NULL | NULL   | YES  | BTREE      |         |               |
| alf_child_assoc |          1 | idx_alf_cass_pri   |            3 | child_node_id       | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
| alf_child_assoc |          1 | will_order_idx     |            1 | assoc_index         | A         |          16 |     NULL | NULL   | YES  | BTREE      |         |               |
| alf_child_assoc |          1 | will_order_idx     |            2 | id                  | A         |     6956126 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+--------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
17 rows in set (0.16 sec)

after read @Solarflare's answer, I tried it out, now the query becomes:

SELECT STRAIGHT_JOIN
    assoc.id AS id,
    parentNode.id AS parentNodeId,
    parentNode.version AS parentNodeVersion,
    parentStore.protocol AS parentNodeProtocol,
    parentStore.identifier AS parentNodeIdentifier,
    parentNode.uuid AS parentNodeUuid,
    childNode.id AS childNodeId,
    childNode.version AS childNodeVersion,
    childStore.protocol AS childNodeProtocol,
    childStore.identifier AS childNodeIdentifier,
    childNode.uuid AS childNodeUuid,
    assoc.type_qname_id AS type_qname_id,
    assoc.child_node_name_crc AS child_node_name_crc,
    assoc.child_node_name AS child_node_name,
    assoc.qname_ns_id AS qname_ns_id,
    assoc.qname_localname AS qname_localname,
    assoc.is_primary AS is_primary,
    assoc.assoc_index AS assoc_index
FROM
    alf_child_assoc assoc FORCE INDEX (will_subq_idx)
        JOIN
    alf_node parentNode ON (parentNode.id = assoc.parent_node_id)
        JOIN
    alf_node childNode ON (childNode.id = assoc.child_node_id)
        JOIN
    alf_store parentStore ON (parentStore.id = parentNode.store_id)
        JOIN
    alf_store childStore ON (childStore.id = childNode.store_id)
WHERE
    parentNode.id = 550
ORDER BY assoc.assoc_index ASC , assoc.id ASC;

and the explain shows:

+----+-------------+-------------+------------+--------+-----------------------------------------------------------------------+---------------+---------+--------------------------+--------+----------+---------------------------------------+
| id | select_type | table       | partitions | type   | possible_keys                                                         | key           | key_len | ref                      | rows   | filtered | Extra                                 |
+----+-------------+-------------+------------+--------+-----------------------------------------------------------------------+---------------+---------+--------------------------+--------+----------+---------------------------------------+
|  1 | SIMPLE      | assoc       | NULL       | ref    | will_subq_idx                                                         | will_subq_idx | 8       | const                    | 303104 |   100.00 | Using index condition; Using filesort |
|  1 | SIMPLE      | parentNode  | NULL       | const  | PRIMARY,store_id,fk_alf_node_store,will_store_node_idx,will_store_idx | PRIMARY       | 8       | const                    |      1 |   100.00 | NULL                                  |
|  1 | SIMPLE      | childNode   | NULL       | eq_ref | PRIMARY,store_id,fk_alf_node_store,will_store_node_idx,will_store_idx | PRIMARY       | 8       | repo.assoc.child_node_id |      1 |   100.00 | NULL                                  |
|  1 | SIMPLE      | parentStore | NULL       | eq_ref | PRIMARY,will_store_idx                                                | PRIMARY       | 8       | repo.parentNode.store_id |      1 |   100.00 | NULL                                  |
|  1 | SIMPLE      | childStore  | NULL       | eq_ref | PRIMARY,will_store_idx                                                | PRIMARY       | 8       | repo.childNode.store_id  |      1 |   100.00 | NULL                                  |
+----+-------------+-------------+------------+--------+-----------------------------------------------------------------------+---------------+---------+--------------------------+--------+----------+---------------------------------------+
5 rows in set, 1 warning (0.03 sec)

this is the query plan showed by mysql workbench:

enter image description here

Thanks for you guys, you're awesome!

like image 910
oxnz Avatar asked May 25 '16 11:05

oxnz


1 Answers

The relevant key you need here is alf_store(parent_node_id, child_node_id). You have the index idx_alf_cass_pri with (parent_node_id, is_primary, child_node_id) that is not used.

Please try the following steps to optimize your query:

Create the index alf_store(parent_node_id, child_node_id).

A second index you might need later is alf_node(id, store_id, version, uuid) (you can change the order of version and uuid, depending on other use cases), so add it now too.

Execute/Explain the following central query

select parentNode.id as parentNodeID, childnode.id as childNodeID, 
       assoc.id as assocID
from alf_node as parentNode
join alf_child_assoc as assoc on (parentNode.id = assoc.parent_node_id)
join alf_node childNode on (childNode.id = assoc.child_node_id)
where parentNode.id = 837;

This will determine the minimal run time of your query. Based on your tables sizes and given timings, it should take about 0.5 seconds, at least if you run it a second time to have everything buffered (and of course the time required to return the rows, so do select count(*) from alf_store as ... for testing this time).

The next step is to join this with the store tables and fetching the remaining columns from the table. There are some possible methods that should, in a perfect database, perform the same, but in reality depend a little on your table contents, you might have to test it:

select assoc.id as id, 
       ...
from (
   select parentNode.id as parentNodeID, childnode.id as childNodeID, 
          assoc.id as assocID
   from alf_node as parentNode
   join alf_child_assoc as assoc on (parentNode.id = assoc.parent_node_id)
   join alf_node childNode on (childNode.id = assoc.child_node_id)
   where parentNode.id = 837
) as subquery
join alf_child_assoc assoc on (assoc.id = subquery.assocID)
join alf_node parentNode on (parentNode.id = subquery.parentnodeID)
-- join alf_store parentStore on (parentStore.id = parentNode.store_id)
join alf_node childNode on (childNode.id = subquery.childNodeID)
-- join alf_store childStore on (childStore.id = childNode.store_id)
order by
    assoc.assoc_index ASC,
    assoc.id ASC;

In a first step, leave away the 6 rows from store information - they should have no impact, but might screw something up, so test without them first; if they make an impact, there are ways to force mysql to optimize it correctly.

The covering index alf_node(id, store_id, version, uuid) from before should give you a better result on the outer join, so add it now if you haven't yet. You might add other covering indexes, but the only usefull other index would be on assoc, and that would mean basically all data in the table, and that might be overkill.

Always check the explains to see if the inner subquery still is the same, especially if it gets slower at one point. You might have to force mysql to use the correct index, because it might take another optimization (that can be correct, but can be wrong).

It might still take you some seconds to do all the lookups, depending on your data, data fragmentation and some other things.

Another option would be to gradually add the outer information to the inner query so you can skip some of the additional joins that I introduced on the outer query again:

select assoc.id as id, 
       ...
       subquery.ParentNodeVersion,
       subquery.ParentNodeUuid,
       ...
from (
   select parentNode.id as parentNodeID, childnode.id as childNodeID, 
          assoc.id as assocID, 
          parentNode.store_Id as ParentNodeStoreID,
          parentNode.Version as ParentNodeVersion,
          parentNode.uuid as ParentNodeUuid,
          childNode.store_Id as ChildNodeStoreID,
          childNode.Version as ChildNodeVersion,
          childNode.uuid as ChildNodeUuid
   from alf_node as parentNode
   join alf_child_assoc as assoc on (parentNode.id = assoc.parent_node_id)
   join alf_node childNode on (childNode.id = assoc.child_node_id)
   where parentNode.id = 837
) as subquery
join alf_child_assoc assoc on (assoc.id = subquery.assocID)
-- join alf_store parentStore on (parentStore.id = subquery.parentNodeStoreId)
-- join alf_store childStore on (childStore.id = subquery.ChildNodeStoreID)
order by
    assoc.assoc_index ASC,
    assoc.id ASC;

Again, start with a version without stores, and it can be better to add them inside the subquery, too.

This is sometimes a little try and error to optimize the subquery and/or covering indexes, you might test, check explains and ask again if you get a specific bottleneck.

The third option is to use your original query (that would be the equivalent of gradually inserting all columns inside the subquery, thus making it not a subquery anymore).

All options should actually have the same performance with this indexes because they express the same thing, but unfortunately that isn't always the reality, because it depends on correct statistics, and these aren't perfect, so you have to test it a bit.

But as I said: the main point is to use the right index on the central 3-table-join, the other point is to try to reduce table lookups by e.g. covering indexes or reordering the query.

like image 97
Solarflare Avatar answered Oct 14 '22 23:10

Solarflare