I have two tables table1 and table2 their definitions are:
CREATE `table1` (
'table1_id' int(11) NOT NULL AUTO_INCREMENT,
'table1_name' VARCHAR(256),
PRIMARY KEY ('table1_id')
)
CREATE `table2` (
'table2_id' int(11) NOT NULL AUTO_INCREMENT,
'table1_id' int(11) NOT NULL,
'table1_name' VARCHAR(256),
PRIMARY KEY ('table2_id'),
FOREIGN KEY ('table1_id') REFERENCES 'table1' ('table1_id')
)
I want to know the number of rows in table1 that are NOT referenced in table2, that can be done with:
SELECT COUNT(t1.table1_id) FROM table1 t1
WHERE t1.table1_id NOT IN (SELECT t2.table1_id FROM table2 t2)
Is there a more efficient way of performing this query?
Upgrade to MySQL 5.6, which optimizes semi-joins against subqueries better.
See http://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html
Or else use an exclusion join:
SELECT COUNT(t1.table1_id) FROM table1 t1
LEFT OUTER JOIN table2 t2 USING (table1_id)
WHERE t2.table1_id IS NULL
Also, make sure table2.table1_id has an index on it.
try using EXISTS.. its generally more efficient than IN
SELECT COUNT(t1.table1_id)
FROM table1 t1
WHERE EXISTS
( SELECT 1
FROM table2 t2
WHERE t2.table1_id <=> t1.table1_id
)
you can do it with NOT EXISTS as well
SELECT COUNT(t1.table1_id)
FROM table1 t1
WHERE NOT EXISTS
( SELECT 1
FROM table2 t2
WHERE t2.table1_id = t1.table1_id
)
EXISTS is generally faster because the execution plan is once it finds a hit, it will quit searching since the condition has proved true. The problem with IN is it will collect all the results from the subquery before further processing... and that takes longer
As @billkarwin noted in the comments EXISTS is using a dependent subquery.. Here is the explain on my two queries and also the OP's query.. http://sqlfiddle.com/#!2/53199d/5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With