Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow MYSQL query with sub queries using COUNT

Right I have no idea why but this query takes well over 6 seconds to execute, index's are all setup correctly and if I run each query separately it works great with less than 0.5 seconds to execute.

Here is the query

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
(SELECT COUNT(*)
    FROM supplier_questions q1
    WHERE c.supplier_id = q1.supplier_id AND q1.incomplete = '0') AS questions, 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
(SELECT COUNT(*)
    FROM supplier_questions q2
    WHERE c.supplier_id = q2.supplier_id AND q2.reviewed = '1') AS reviewed, 
questapproved, 
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20

Results of the Explain query is as follows

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   PRIMARY ss  ref site_id,supplier_id site_id 4   const   1287    Using where; Using temporary; Using filesort
1   PRIMARY c   eq_ref  PRIMARY PRIMARY 4   ss.supplier_id  1   
3   DEPENDENT SUBQUERY  q2  ref supplier_id,reviewed    reviewed    4   const   263 Using where
2   DEPENDENT SUBQUERY  q1  ref supplier_id,incomplete  incomplete  4   const   254 Using where

The reason the count queries are in there is because I need to know the number of rows from those tables, this can't be done in another query as the results also need to be sorted by those values :(

like image 612
Neo Avatar asked May 24 '11 10:05

Neo


People also ask

Why are sub queries slow?

For multiple-table subqueries, execution of NULL IN (SELECT ...) is particularly slow because the join optimizer does not optimize for the case where the outer expression is NULL .

Is CTE faster than subquery?

They both scanned 1.3MB. So there you have it– CTEs and subqueries are the exact same in terms of performance. Since in the CTE the query is on its own and not embedded within another FROM or JOIN statement, it can help logically separate parts of your query.

Are sub queries faster than joins?

The retrieval time of the query using joins almost always will be faster than that of a subquery. By using joins, you can maximize the calculation burden on the database i.e., instead of multiple queries using one join query.


3 Answers

As a stab in the dark, does this run faster? (I havent got a mysql to verify the syntax on, so forgive any slight mistakes, but you might get the idea)

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name, questions, reviewed 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,  
questapproved,  ss.supplier_no AS supplier_no 
FROM suppliers c 
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id 
inner join 
(SELECT supplier_id, sum(if(incomplete='0',1,0)) as questions,  sum(if(incomplete='1',1,0)) as reviewed FROM supplier_questions q1 group by supplier_id) as tmp
on c.supplier_id = tmp.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2' 
GROUP BY c.supplier_id 
ORDER BY c.supplier_name ASC LIMIT 0, 20 
like image 94
BugFinder Avatar answered Oct 16 '22 00:10

BugFinder


FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id 
ORDER BY c.supplier_name ASC

Since autogenerated primary keys are never equal to 0 (unless big db design mistake) you can drop the c.supplier_id != '0' clause.

ss.site_id = '2' should be in the JOIN condition for readability.

It looks like this should match only one row in table supplier_site per supplier (if this is your usual 1-N thing-addresses relation, ie you're selecting the second address of each supplier, maybe '2' corresponds to 'billing address' or something) so the GROUP BY c.supplier_id is useless. If the GROUP BY actually does something, then the query is wrong, since the "address" columns, which presumably come from supplier_site table, would come from a random row.

So here's the simplified FROM (the WHERE is gone) :

FROM suppliers c
INNER JOIN supplier_site ss ON 
    (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
ORDER BY c.supplier_name ASC

I suppose you got an index on c.supplier_name so this part of the query should be very fast.

Now try this query :

SELECT a.*,
    questapproved, 
    ss.supplier_no AS supplier_no,
    IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
    sum( q.incomplete = '0') AS questions,
    sum( q.reviewed = '1' ) AS reviewed
FROM
(
    SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name
    FROM suppliers c
    INNER JOIN supplier_site ss ON 
        (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
    ORDER BY c.supplier_name ASC
    LIMIT 0, 20
) a
LEFT JOIN supplier_questions q ON (q.supplier_id = c.supplier_id)
GROUP BY c.supplier_id
ORDER BY c.supplier_name;
like image 36
bobflux Avatar answered Oct 16 '22 02:10

bobflux


If you remove the sub-selects you end up with something like this:

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
COUNT(IF (q1.incomplete = '0', '0', null)) AS questions, 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
COUNT(IF (q1.reviewed = '1', '1', null)) AS reviewed,
questapproved, 
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
LEFT OUTER JOIN supplier_questions q1 ON c.supplier_id = q1.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20

I don't have a MySQL database available so there may be errors in my SQL. The idea is to remove the subqueries and replace them with an outer join and use IF to only count relevant rows.

like image 24
Klas Lindbäck Avatar answered Oct 16 '22 02:10

Klas Lindbäck