Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I speed up MySQL query with multiple joins

Tags:

sql

join

mysql

Here is my issue, I am selecting and doing multiple joins to get the correct items...it pulls in a fair amount of rows, above 100,000. This query takes more than 5mins when the date range is set to 1 year.

I don't know if it's possible but I am afraid that the user might extend the date range to like ten years and crash it.

Anyone know how I can speed this up? Here is the query.

SELECT DISTINCT t1.first_name, t1.last_name, t1.email 
FROM table1 AS t1 
INNER JOIN table2 AS t2 ON t1.CU_id = t2.O_cid 
INNER JOIN table3 AS t3 ON t2.O_ref = t3.I_oref 
INNER JOIN table4 AS t4 ON t3.I_pid = t4.P_id 
INNER JOIN table5 AS t5 ON t4.P_cat = t5.C_id 
WHERE t1.subscribe =1 
AND t1.Cdate >= $startDate
AND t1.Cdate <= $endDate
AND t5.store =2

I am not the greatest with mysql so any help would be appreciated!

Thanks in advance!

UPDATE

Here is the explain you asked for

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  t5  ref     PRIMARY,C_store_type,C_id,C_store_type_2    C_store_type_2  1   const   101     Using temporary
1   SIMPLE  t4  ref     PRIMARY,P_cat   P_cat   5   alphacom.t5.C_id    326     Using where
1   SIMPLE  t3  ref     I_pid,I_oref    I_pid   4   alphacom.t4.P_id    31   
1   SIMPLE  t2  eq_ref  O_ref,O_cid     O_ref   28  alphacom.t3.I_oref  1    
1   SIMPLE  t1  eq_ref  PRIMARY     PRIMARY     4   alphacom.t2.O_cid   1   Using where

Also I added an index to table5 rows and table4 rows because they don't really change, however the other tables get around 500-1000 entries a month... I heard you should add an index to a table that has that many new entries....is this true?

like image 762
BinarySolo00100 Avatar asked Oct 21 '10 16:10

BinarySolo00100


People also ask

Do joins make queries slow?

Joins: If your query joins two tables in a way that substantially increases the row count of the result set, your query is likely to be slow. There's an example of this in the subqueries lesson. Aggregations: Combining multiple rows to produce a result requires more computation than simply retrieving those rows.

Which joins is faster in MySQL?

performance - Mysql - LEFT JOIN way faster than INNER JOIN - Stack Overflow.


3 Answers

I'd try the following:

First, ensure there are indexes on the following tables and columns (each set of columns in parentheses should be a separate index):

table1 : (subscribe, CDate)
         (CU_id)
table2 : (O_cid)
         (O_ref)
table3 : (I_oref)
         (I_pid)
table4 : (P_id)
         (P_cat)
table5 : (C_id, store)

Second, if adding the above indexes didn't improve things as much as you'd like, try rewriting the query as

SELECT DISTINCT t1.first_name, t1.last_name, t1.email FROM
  (SELECT CU_id, t1.first_name, t1.last_name, t1.email
     FROM table1
     WHERE subscribe = 1 AND
           CDate >= $startDate AND
           CDate <= $endDate) AS t1
  INNER JOIN table2 AS t2
    ON t1.CU_id = t2.O_cid   
  INNER JOIN table3 AS t3
    ON t2.O_ref = t3.I_oref   
  INNER JOIN table4 AS t4
    ON t3.I_pid = t4.P_id   
  INNER JOIN (SELECT C_id FROM table5 WHERE store = 2) AS t5
    ON t4.P_cat = t5.C_id

I'm hoping here that the first sub-select would cut down significantly on the number of rows to be considered for joining, hopefully making the subsequent joins do less work. Ditto the reasoning behind the second sub-select on table5.

In any case, mess with it. I mean, ultimately it's just a SELECT - you can't really hurt anything with it. Examine the plans that are generated by each different permutation and try to figure out what's good or bad about each.

Share and enjoy.

like image 160

Make sure your date columns and all the columns you are joining on are indexed.

Doing an unequivalence operator on your dates means it checks every row, which is inherently slower than an equivalence.

Also, using DISTINCT adds an extra comparison to the logic that your optimizer is running behind the scenes. Eliminate that if possible.

like image 34
JNK Avatar answered Oct 03 '22 22:10

JNK


Well, first, make a subquery to decimate table1 down to just the records you actually want to go to all the trouble of joining...

SELECT DISTINCT t1.first_name, t1.last_name, t1.email  
FROM (  
SELECT first_name, last_name, email, CU_id FROM table1 WHERE  
table1.subscribe = 1  
AND table1.Cdate >= $startDate  
AND table1.Cdate <= $endDate  
) AS t1  
INNER JOIN table2 AS t2 ON t1.CU_id = t2.O_cid  
INNER JOIN table3 AS t3 ON t2.O_ref = t3.I_oref  
INNER JOIN table4 AS t4 ON t3.I_pid = t4.P_id  
INNER JOIN table5 AS t5 ON t4.P_cat = t5.C_id  
WHERE t5.store = 2

Then start looking at modifying the directionality of the joins.

Additionally, if t5.store is only very rarely 2, then flip this idea around: construct the t5 subquery, then join it back and back and back.

like image 35
Eric Towers Avatar answered Oct 03 '22 22:10

Eric Towers