
MySQL query with JOIN and GROUP BY optimization. Is it possible?

I have two tables: gpnxuser and key_value

mysql> describe gpnxuser;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| version      | bigint(20)   | NO   |     | NULL    |                |
| email        | varchar(255) | YES  |     | NULL    |                |
| uuid         | varchar(255) | NO   | MUL | NULL    |                |
| partner_id   | bigint(20)   | NO   | MUL | NULL    |                |
| password     | varchar(255) | YES  |     | NULL    |                |
| date_created | datetime     | YES  |     | NULL    |                |
| last_updated | datetime     | YES  |     | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+

and

mysql> describe key_value;
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| id             | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| version        | bigint(20)   | NO   |     | NULL    |                |
| date_created   | datetime     | YES  |     | NULL    |                |
| last_updated   | datetime     | YES  |     | NULL    |                |
| upkey          | varchar(255) | NO   | MUL | NULL    |                |
| user_id        | bigint(20)   | YES  | MUL | NULL    |                |
| security_level | int(11)      | NO   |     | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+

key_value.user_id is a FK that references gpnxuser.id. I also have an index on gpnxuser.partner_id, which is a FK that references a table called "partner" (which, I think, does not matter much for this question).

For partner_id = 64, I have 500K rows in gpnxuser, which are related to approximately 6M rows in key_value.

I want a query that returns all distinct key_value.upkey values for the users belonging to a given partner. I tried something like this:

select upkey from gpnxuser join key_value on gpnxuser.id=key_value.user_id where partner_id=64 group by upkey;

which takes forever to run. The explain for the query looks like:

mysql> explain select upkey from gpnxuser join key_value on gpnxuser.id=key_value.user_id where partner_id=64 group by upkey;

    +----+-------------+-----------+------+----------------------------+--------------------+---------+-----------------------------+--------+----------------------------------------------+
    | id | select_type | table     | type | possible_keys              | key                | key_len | ref                         | rows   | Extra                                        |
    +----+-------------+-----------+------+----------------------------+--------------------+---------+-----------------------------+--------+----------------------------------------------+
    |  1 | SIMPLE      | gpnxuser  | ref  | PRIMARY,FKB2D9FEBE725C505E | FKB2D9FEBE725C505E | 8       | const                       | 259640 | Using index; Using temporary; Using filesort |
    |  1 | SIMPLE      | key_value | ref  | FK9E0C0F912D11F5A9         | FK9E0C0F912D11F5A9 | 9       | gpnx_finance_db.gpnxuser.id |     14 | Using where                                  |
    +----+-------------+-----------+------+----------------------------+--------------------+---------+-----------------------------+--------+----------------------------------------------+

My question is: is there a query that can run fast and obtain the result that I want?

Manuel Guimarães Pinto Filho asked Nov 12 '22 06:11


1 Answer

What you need to do is use an EXISTS subquery. This way, for each distinct upkey, the tables are scanned only until the first match is found, and no further.

select upkey from (select distinct upkey from key_value) upk
where EXISTS
    (select 1 from gpnxuser u join key_value kv on u.id = kv.user_id
     where u.partner_id = 64 and kv.upkey = upk.upkey);

NB: In the original query, GROUP BY is misused. Since the query has no aggregate function, DISTINCT expresses the intent better:

select DISTINCT upkey from gpnxuser join key_value
on gpnxuser.id = key_value.user_id where partner_id = 64;
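
Before swapping one form for another, it can help to confirm that the GROUP BY, DISTINCT and EXISTS variants return the same set of upkeys. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 module with an in-memory database rather than MySQL, the tables are cut down to the columns the queries touch, and the rows are made-up toy data. It checks that the results agree, not how fast each query runs; timing still has to be measured with EXPLAIN on the real MySQL data.

```python
import sqlite3

# Toy stand-ins for the question's tables (only the columns the queries use).
# Assumption: SQLite in memory, not MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gpnxuser (id INTEGER PRIMARY KEY, partner_id INTEGER NOT NULL);
    CREATE TABLE key_value (
        id INTEGER PRIMARY KEY,
        upkey TEXT NOT NULL,
        user_id INTEGER REFERENCES gpnxuser(id)
    );
    INSERT INTO gpnxuser (id, partner_id) VALUES (1, 64), (2, 64), (3, 7);
    INSERT INTO key_value (upkey, user_id) VALUES
        ('color', 1), ('color', 2), ('size', 2), ('weight', 3), ('orphan', NULL);
""")

# The three query shapes discussed above, with the partner id as a parameter.
GROUP_BY = """
    SELECT upkey FROM gpnxuser JOIN key_value ON gpnxuser.id = key_value.user_id
    WHERE partner_id = ? GROUP BY upkey
"""
DISTINCT = """
    SELECT DISTINCT upkey FROM gpnxuser JOIN key_value ON gpnxuser.id = key_value.user_id
    WHERE partner_id = ?
"""
EXISTS = """
    SELECT upkey FROM (SELECT DISTINCT upkey FROM key_value) upk
    WHERE EXISTS (
        SELECT 1 FROM gpnxuser u JOIN key_value kv ON u.id = kv.user_id
        WHERE u.partner_id = ? AND kv.upkey = upk.upkey
    )
"""


def upkeys(query, partner_id):
    """Run one query and return its upkey values as a sorted list."""
    return sorted(row[0] for row in conn.execute(query, (partner_id,)))


# Collect the result of each variant for the same partner so they can be compared.
results = {
    name: upkeys(query, 64)
    for name, query in [("group_by", GROUP_BY), ("distinct", DISTINCT), ("exists", EXISTS)]
}
print(results)
```

If the three lists match on a representative sample, the choice between the variants comes down to the execution plan each one gets on the real server.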
snowindy answered Nov 14 '22 23:11