MySQL WHERE NOT IN extremely slow

Tags: sql, mysql, query-optimization

Below is a SQL statement inside a stored procedure (truncated for brevity):

SELECT * 
FROM item a 
WHERE a.orderId NOT IN (SELECT orderId FROM table_excluded_item);

This statement takes 30 seconds or so! But if I remove the inner SELECT query, it drops to 1s. table_excluded_item is not huge, but I suspect the inner query is being executed more than it needs to be.

Is there a more efficient way of doing this?

280

asked Jan 05 '13 02:01

pixelfreak

2 Answers

Use a LEFT JOIN and keep only the rows that have no match:

SELECT  a.* 
FROM    item a 
        LEFT JOIN table_excluded_item b
            ON a.orderId = b.orderId
WHERE   b.orderId IS NULL

Make sure that orderId is indexed in both tables.
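
The indexing advice above could be applied roughly as follows. This is only a sketch: the index names idx_item_orderid and idx_excluded_orderid are made up for illustration, and the statements assume that orderId is not already covered by a primary key or another index on either table.

```sql
-- Hypothetical index names; skip a statement if the column is already indexed.
CREATE INDEX idx_item_orderid ON item (orderId);
CREATE INDEX idx_excluded_orderid ON table_excluded_item (orderId);

-- List the indexes that now exist on each table.
SHOW INDEX FROM item;
SHOW INDEX FROM table_excluded_item;
```

Running SHOW INDEX first is a cheap way to avoid creating a redundant index.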

141

answered Sep 18 '22 12:09

John Woo

The problem with the left join approach is that duplicate records might be processed in generating the output. Sometimes this is not the case: according to this article, MySQL does optimize the left outer join correctly when the columns are indexed, even in the presence of duplicates. I remain skeptical, though, that this optimization always happens.

MySQL sometimes has problems optimizing IN predicates that contain a subquery. The best fix is a correlated subquery with NOT EXISTS:

SELECT * 
FROM item a 
WHERE not exists (select 1
                  from table_excluded_item tei
                  where tei.orderid = a.orderid
                  limit 1
                 )

If you have an index on table_excluded_item.orderid, then for each row of item the subquery looks up that index and stops at the first matching value (the limit 1 is not strictly necessary, since EXISTS only needs to find one matching row). This is the fastest and safest way to implement what you want in MySQL.
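
One way to check whether the rewrite actually helps on a given server is to compare the execution plans of the two forms with EXPLAIN. The sketch below reuses the table and column names from the question. The plan output depends on the MySQL version and on which indexes exist, so no expected output is shown.

```sql
-- Plan for the original NOT IN form.
EXPLAIN
SELECT *
FROM item a
WHERE a.orderId NOT IN (SELECT orderId FROM table_excluded_item);

-- Plan for the NOT EXISTS form.
EXPLAIN
SELECT *
FROM item a
WHERE NOT EXISTS (SELECT 1
                  FROM table_excluded_item tei
                  WHERE tei.orderid = a.orderid);
```

The two queries return the same rows only if orderId is never NULL in either table. If the subquery returns a NULL, the NOT IN form returns no rows at all, while NOT EXISTS simply ignores the NULL.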

answered Sep 19 '22 12:09

Gordon Linoff
