Transact-SQL - sub query or left-join?

Question

I have two tables containing Tasks and Notes, and want to retrieve a list of tasks with the number of associated notes for each one. These two queries do the job:

select t.TaskId,
       (select count(n.TaskNoteId) from TaskNote n where n.TaskId = t.TaskId) 'Notes'
from   Task t

-- or
select t.TaskId,
       count(n.TaskNoteId) 'Notes'
from   Task t
left join
       TaskNote n
on     t.TaskId = n.TaskId
group by t.TaskId

Is there a difference between them and should I be using one over the other, or are they just two ways of doing the same job? Thanks.

Austin Salonen · Accepted Answer

On small datasets they are wash when it comes to performance. When indexed, the LOJ is a little better.

I've found on large datasets that an inner join (an inner join will work too.) will outperform the subquery by a very large factor (sorry, no numbers).

Cade Roux · Answer

In most cases, the optimizer will treat them the same.

I tend to prefer the second, because it has less nesting, which makes it easier to read and easier to maintain. I have started to use SQL Server's common table expressions to reduce nesting as well for the same reason.

In addition, the second syntax is more flexible if there are further aggregates which may be added in the future in addition to COUNT, like MIN(some_scalar), MAX(), AVG() etc.

ConcernedOfTunbridgeWells · Answer

The subquery will be slower as it is being executed for every row in the outer query. The join will be faster as it is done once. I believe that the query optimiser will not rewrite this query plan as it can't recognize the equivalence.

Normally you would do a join and group by for this sort of count. Correlated subqueries of the sort you show are mainly of interest if they have to do some grouping or more complex predicate on a table that is not participating in another join.

Simon Forrest · Answer

If you're using SQL Server Management Studio, you can enter both versions into the Query Editor and then right-click and choose Display Estimated Execution Plan. It will give you two percentage costs relative to the batch. If they're expected to take the same time, they'll both show as 50% - in which case, choose whichever you prefer for other reasons (easier to read, easier to maintain, better fit with your coding standards etc). Otherwise, you can pick the one with the lower percentage cost relative to the batch.

You can use the same technique to look at changing any query to improve performance by comparing two versions that do the same thing.

Of course, because it's a cost relative to the batch, it doesn't mean that either query is as fast as it could be - it just tells you how they compare to each other, not to some notional optimum query to get the same results.

Transact-SQL - sub query or left-join?

Tags:

sql

tsql

Nick

4 Answers

Austin Salonen

Cade Roux

ConcernedOfTunbridgeWells

Simon Forrest

Recent Activity

Donate For Us

Transact-SQL - sub query or left-join?

Tags:

sql

tsql

Nick

4 Answers

Austin Salonen

Cade Roux

ConcernedOfTunbridgeWells

Simon Forrest

Related questions

Recent Activity

Donate For Us