I discovered that in some cases a query like
select
usertable.userid,
(select top 1 name from nametable where userid = usertable.userid) as name
from usertable
where active = 1
takes an order of magnitude longer to complete in SQL Server 2008 R2 than the equivalent join query
select
usertable.userid,
nametable.name
from usertable
left join nametable on nametable.userid = usertable.userid
where usertable.active = 1
where both tables are indexed and have over 100k rows. Interestingly, inserting a top clause into the original query makes it perform on par with the join query:
select
top (select count(*) from usertable where active = 1) usertable.userid,
(select top 1 name from nametable where userid = usertable.userid) as name
from usertable
where active = 1
Does anyone have any idea why the original query performs so poorly?
Well, the queries are different: unless the userid column in nametable is a primary key or has a uniqueness constraint, the second query could return more rows than the first.
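To make the difference concrete, here is a minimal sketch using table variables. The sample rows are made up for illustration; only the table and column names come from the question.

```sql
-- Hypothetical sample data: user 1 has two names, user 2 has none.
declare @usertable table (userid int primary key, active bit);
declare @nametable table (userid int, name varchar(50));

insert into @usertable (userid, active) values (1, 1), (2, 1);
insert into @nametable (userid, name) values (1, 'Ann'), (1, 'Annie');

-- Subquery form: always exactly one row per active user (2 rows here).
select
u.userid,
(select top 1 n.name from @nametable n where n.userid = u.userid) as name
from @usertable u
where u.active = 1;

-- Join form: one row per matching name, so user 1 appears twice (3 rows here).
select
u.userid,
n.name
from @usertable u
left join @nametable n on n.userid = u.userid
where u.active = 1;
```

Note also that top 1 without an order by does not guarantee which of the two names is returned for user 1.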
That said, assuming userid is a primary key or otherwise unique, try removing the TOP 1 part of the subquery in the first query:
select
usertable.userid,
(select name from nametable where userid = usertable.userid) as name
from usertable
where active = 1
It's a correlated subquery: because it references a column of the outer query, it needs to execute once for each row the outer query returns.
A JOIN runs once for the entire result set and gets merged. Your query runs the outer query, then for each returned row it runs the subquery again.
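One way to check whether this is what is happening in your case is to run both forms with I/O and timing statistics switched on and compare the logical reads reported for nametable. This is only a sketch of how to measure it, using the queries from the question; the numbers you see will depend on your indexes and on the plan the optimizer picks.

```sql
-- Report per-table reads and CPU/elapsed time for each statement.
set statistics io on;
set statistics time on;

-- Correlated subquery form from the question.
select
usertable.userid,
(select top 1 name from nametable where userid = usertable.userid) as name
from usertable
where active = 1;

-- Join form from the question.
select
usertable.userid,
nametable.name
from usertable
left join nametable on nametable.userid = usertable.userid
where usertable.active = 1;

set statistics io off;
set statistics time off;
```

If the subquery form shows far more logical reads or a much higher scan count on nametable than the join form, that points to nametable being accessed once per outer row. Comparing the actual execution plans of the two statements should show the same thing.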