Right I have no idea why but this query takes well over 6 seconds to execute, index's are all setup correctly and if I run each query separately it works great with less than 0.5 seconds to execute.
Here is the query
SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
(SELECT COUNT(*)
FROM supplier_questions q1
WHERE c.supplier_id = q1.supplier_id AND q1.incomplete = '0') AS questions,
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,
(SELECT COUNT(*)
FROM supplier_questions q2
WHERE c.supplier_id = q2.supplier_id AND q2.reviewed = '1') AS reviewed,
questapproved,
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20
Results of the Explain query is as follows
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ss ref site_id,supplier_id site_id 4 const 1287 Using where; Using temporary; Using filesort
1 PRIMARY c eq_ref PRIMARY PRIMARY 4 ss.supplier_id 1
3 DEPENDENT SUBQUERY q2 ref supplier_id,reviewed reviewed 4 const 263 Using where
2 DEPENDENT SUBQUERY q1 ref supplier_id,incomplete incomplete 4 const 254 Using where
The reason the count queries are in there is because I need to know the number of rows from those tables, this can't be done in another query as the results also need to be sorted by those values :(
For multiple-table subqueries, execution of NULL IN (SELECT ...) is particularly slow because the join optimizer does not optimize for the case where the outer expression is NULL .
They both scanned 1.3MB. So there you have it– CTEs and subqueries are the exact same in terms of performance. Since in the CTE the query is on its own and not embedded within another FROM or JOIN statement, it can help logically separate parts of your query.
The retrieval time of the query using joins almost always will be faster than that of a subquery. By using joins, you can maximize the calculation burden on the database i.e., instead of multiple queries using one join query.
As a stab in the dark, does this run faster? (I havent got a mysql to verify the syntax on, so forgive any slight mistakes, but you might get the idea)
SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name, questions, reviewed
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,
questapproved, ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
inner join
(SELECT supplier_id, sum(if(incomplete='0',1,0)) as questions, sum(if(incomplete='1',1,0)) as reviewed FROM supplier_questions q1 group by supplier_id) as tmp
on c.supplier_id = tmp.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC LIMIT 0, 20
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
Since autogenerated primary keys are never equal to 0 (unless big db design mistake) you can drop the c.supplier_id != '0' clause.
ss.site_id = '2' should be in the JOIN condition for readability.
It looks like this should match only one row in table supplier_site per supplier (if this is your usual 1-N thing-addresses relation, ie you're selecting the second address of each supplier, maybe '2' corresponds to 'billing address' or something) so the GROUP BY c.supplier_id is useless. If the GROUP BY actually does something, then the query is wrong, since the "address" columns, which presumably come from supplier_site table, would come from a random row.
So here's the simplified FROM (the WHERE is gone) :
FROM suppliers c
INNER JOIN supplier_site ss ON
(c.supplier_id = ss.supplier_id AND ss.site_id = '2')
ORDER BY c.supplier_name ASC
I suppose you got an index on c.supplier_name so this part of the query should be very fast.
Now try this query :
SELECT a.*,
questapproved,
ss.supplier_no AS supplier_no,
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,
sum( q.incomplete = '0') AS questions,
sum( q.reviewed = '1' ) AS reviewed
FROM
(
SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name
FROM suppliers c
INNER JOIN supplier_site ss ON
(c.supplier_id = ss.supplier_id AND ss.site_id = '2')
ORDER BY c.supplier_name ASC
LIMIT 0, 20
) a
LEFT JOIN supplier_questions q ON (q.supplier_id = c.supplier_id)
GROUP BY c.supplier_id
ORDER BY c.supplier_name;
If you remove the sub-selects you end up with something like this:
SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
COUNT(IF (q1.incomplete = '0', '0', null)) AS questions,
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,
COUNT(IF (q1.reviewed = '1', '1', null)) AS reviewed,
questapproved,
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
LEFT OUTER JOIN supplier_questions q1 ON c.supplier_id = q1.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20
I don't have a MySQL database available so there may be errors in my SQL. The idea is to remove the subqueries and replace them with an outer join and use IF to only count relevant rows.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With