More efficient query than NOT IN (nested select)

Question

I have two tables table1 and table2 their definitions are:

CREATE `table1` (
    'table1_id' int(11) NOT NULL AUTO_INCREMENT,
    'table1_name' VARCHAR(256),
     PRIMARY KEY ('table1_id')
)

CREATE `table2` (
    'table2_id' int(11) NOT NULL AUTO_INCREMENT,
    'table1_id' int(11) NOT NULL,
    'table1_name' VARCHAR(256),
     PRIMARY KEY ('table2_id'),
     FOREIGN KEY ('table1_id') REFERENCES 'table1' ('table1_id')
)

I want to know the number of rows in table1 that are NOT referenced in table2, that can be done with:

SELECT COUNT(t1.table1_id) FROM table1 t1 
WHERE t1.table1_id NOT IN (SELECT t2.table1_id FROM table2 t2)

Is there a more efficient way of performing this query?

Bill Karwin · Accepted Answer

Upgrade to MySQL 5.6, which optimizes semi-joins against subqueries better.

See http://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html

Or else use an exclusion join:

SELECT COUNT(t1.table1_id) FROM table1 t1 
LEFT OUTER JOIN table2 t2 USING (table1_id)
WHERE t2.table1_id IS NULL

Also, make sure table2.table1_id has an index on it.

John Ruddell · Answer

try using EXISTS.. its generally more efficient than IN

SELECT COUNT(t1.table1_id) 
FROM table1 t1 
WHERE EXISTS
(   SELECT 1 
    FROM table2 t2
    WHERE t2.table1_id <=> t1.table1_id
)

you can do it with NOT EXISTS as well

SELECT COUNT(t1.table1_id) 
FROM table1 t1 
WHERE NOT EXISTS
(   SELECT 1 
    FROM table2 t2
    WHERE t2.table1_id = t1.table1_id
)

EXISTS is generally faster because the execution plan is once it finds a hit, it will quit searching since the condition has proved true. The problem with IN is it will collect all the results from the subquery before further processing... and that takes longer

As @billkarwin noted in the comments EXISTS is using a dependent subquery.. Here is the explain on my two queries and also the OP's query.. http://sqlfiddle.com/#!2/53199d/5

More efficient query than NOT IN (nested select)

Tags:

performance

sql

mysql

A.O.

2 Answers

Bill Karwin

John Ruddell

Recent Activity

Donate For Us

More efficient query than NOT IN (nested select)

Tags:

performance

sql

mysql

A.O.

2 Answers

Bill Karwin

John Ruddell

Related questions

Recent Activity

Donate For Us