
SQL Server join vs subquery performance question

I discovered that in some cases a query like

select 
   usertable.userid,
   (select top 1 name from nametable where userid = usertable.userid) as name 
from usertable 
where active = 1

takes an order of magnitude longer to complete in SQL Server 2008 R2 than the equivalent join query

select 
   usertable.userid,
   nametable.name 
from usertable 
left join nametable on nametable.userid = usertable.userid 
where usertable.active = 1

where both tables are indexed and have over 100k rows. Interestingly, inserting a top clause into the original query makes it perform on par with the join query:

select 
    top (select count(*) from usertable where active = 1) usertable.userid,
    (select top 1 name from nametable where userid = usertable.userid) as name 
from usertable 
where active = 1

Does anyone have any idea why the original query performs so poorly?

Dave Causey asked Sep 08 '11 14:09


2 Answers

Well, the queries are different: unless nametable.userid is a primary key or has a uniqueness constraint, the second query could return more rows than the first.

That said, assuming userid is a primary key or unique, try removing the TOP 1 from the subquery in the first query:

select 
   usertable.userid,
   (select name from nametable where userid = usertable.userid) as name 
from usertable 
where active = 1
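
To illustrate the point about the two queries differing, here is a minimal sketch using made-up sample data. The table variables @usertable and @nametable and their rows are hypothetical stand-ins for the tables in the question, with userid deliberately not unique in @nametable:

```sql
-- Hypothetical sample data; table variables stand in for usertable and nametable.
declare @usertable table (userid int primary key, active bit);
declare @nametable table (userid int, name varchar(50));

insert into @usertable (userid, active) values (1, 1), (2, 1);
-- userid 1 has two names, so userid is not unique in @nametable.
insert into @nametable (userid, name) values (1, 'Ann'), (1, 'Annie'), (2, 'Bob');

-- Subquery form: exactly one row per active user.
select u.userid,
       (select top 1 n.name from @nametable n where n.userid = u.userid) as name
from @usertable u
where u.active = 1;

-- Join form: one row per matching name, so userid 1 appears twice.
select u.userid, n.name
from @usertable u
left join @nametable n on n.userid = u.userid
where u.active = 1;
```

With this data the subquery form returns two rows and the join form returns three. Note that dropping TOP 1 from the subquery is only safe when userid really is unique in nametable; with duplicates like the ones above, SQL Server raises an error because the scalar subquery returns more than one value.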
Justin answered Oct 17 '22 17:10


It's a correlated subquery, which means it needs to execute once per returned row of the outer query, since it references a column from the outer query.

A JOIN runs once for the entire result set and the results are merged. Your query runs the outer query, then runs the subquery again for each row it returns.
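
One way to check whether the subquery is actually being evaluated per row on your server is to compare I/O and timing statistics for the two forms, alongside their actual execution plans. This is only a sketch; it assumes the usertable and nametable tables from the question, and the numbers will depend on your data and indexes:

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
set statistics io on;
set statistics time on;

-- Correlated subquery form from the question.
select
   usertable.userid,
   (select top 1 name from nametable where userid = usertable.userid) as name
from usertable
where active = 1;

-- Join form from the question.
select
   usertable.userid,
   nametable.name
from usertable
left join nametable on nametable.userid = usertable.userid
where usertable.active = 1;

set statistics io off;
set statistics time off;
```

If the Messages output shows far more logical reads or scans against nametable for the first statement than for the second, that is consistent with the subquery being executed repeatedly rather than joined once.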

JNK answered Oct 17 '22 18:10