Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL "IN" queries terribly slow with subquery but fast with explicit values

I have a MySQL query (Ubu 10.04,Innodb, Core i7, 16Gb RAM, SSD drives, MySQL params optimized):

SELECT COUNT(DISTINCT subscriberid) FROM em_link_data WHERE linkid in (SELECT l.id FROM em_link l WHERE l.campaignid = '2900' AND l.link != 'open') 

The table em_link_data has about 7million rows, em_link has a few thousand. This query will take about 18 seconds to complete. However, if I substitute the results of the subquery and do this:

SELECT COUNT(DISTINCT subscriberid) FROM em_link_data WHERE linkid in (24899,24900,24901,24902); 

then the query will run in less than 1 millisecond. The subquery alone runs in less than 1ms, the column linkid is indexed.

If I rewrite the query as a join, also less than 1ms. Why is a "IN" query so slow with a subquery in it and why so fast with values in it? I can't rewrite the query (bought software) so I was hoping there is some tweak or hint to speedup this query! Any help is appreciated.

like image 267
Franco Avatar asked Feb 16 '11 15:02

Franco


People also ask

Why is subquery slow?

For multiple-table subqueries, execution of NULL IN (SELECT ...) is particularly slow because the join optimizer does not optimize for the case where the outer expression is NULL .

Why is my MySQL query running slow?

Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.


2 Answers

Subqueries execute every time you evaluate them (in MySQL anyway, not all RDBMSes), i.e. you're basically running 7 million queries! Using a JOIN, if possible, will reduce this to 1. Even if adding indexing improves performance of those, you're still running them.

like image 141
Brian Avatar answered Sep 19 '22 23:09

Brian


Yes, IN with subqueries is slow. Use a join instead.

SELECT COUNT(DISTINCT subscriberid) FROM em_link_data JOIN em_link ON em_link_data.linkid=em_link.id WHERE em_link.campaignid = '2900' AND em_link.link != 'open' 

And make sure you've defined indexes on em_link_data.linkid and em_link.id.

like image 36
awm Avatar answered Sep 21 '22 23:09

awm