This is a snippet from my stored Proc
SELECT NULL AS StoryID
, AlbumID
, CAST(NULL as varchar) AS StoryTitle
, AlbumName
, (SELECT URL FROM AlbumPictures AS AlbumPictures_3 WHERE (AlbumID = Albums.AlbumID) AND (AlbumCover = 'True')) AS AlbumCover
, Votes
, CAST(NULL as Int) AS PictureId
, 'albums' AS tableName
, (SELECT NestedAlbums.AlbumID FROM NestedAlbums WHERE (AlbumID = Albums.AlbumID)) AS Flag
INTO #Results2
FROM Albums WHERE AlbumID IN (SELECT StringVal FROM funcListToTableInt(@whereAlbumID))
I have used nested selects in my Query above. I am curious to know whether Nested Selects
are better than LEFT/Right JOINS
OR should i use JOINS
?
Table Albums:
Table NestedAlbums:
In general writing this an explicit OUTER JOIN
would be better.
SQL Server will probably need to add an Assert to the plan with the sub query version that verifies the sub query only returns at most one row (unless this is guaranteed by a unique index). This can limit the possible transformations available. See Scalar Subqueries for more about this.
Also (though not relevant to the example in your question as both sub queries are different) writing as an explicit JOIN
allows you to use multiple columns from the joined table with one lookup whereas using separate similar subqueries would not (SQL Server has no logic to detect the common sub expressions).
Edit:
Following discussion in comments something like
SELECT NULL AS StoryID,
A.AlbumID,
CAST(NULL AS VARCHAR(30)) AS StoryTitle,
A.AlbumName,
AP.URL AS AlbumCover,
A.Votes,
CAST(NULL AS INT) AS PictureId,
'albums' AS tableName,
CASE
WHEN EXISTS (SELECT *
FROM NestedAlbums NA
WHERE NA.AlbumID = A.AlbumID
AND ( AccountId = @AccountId )) THEN 1
ELSE 0
END AS Flag
INTO #Results2
FROM Albums A
LEFT OUTER JOIN AlbumPictures AP
ON ( AP.AlbumID = A.AlbumID )
AND ( AP.AlbumCover = 'True' )
WHERE A.AlbumID IN (SELECT StringVal
FROM funcListToTableInt(@whereAlbumID))
You may notice this still has a sub query in the SELECT
list but CASE ... EXISTS
will be implemented efficiently as a semi join.
At the moment your query assumes that at most one matching row per album will be returned from AlbumPictures
and would error out if this assumption is not true. This changes the semantics in that no error will be returned and you would get multiple rows with the various URL
s. If you didn't want that to happen you would need to define which URL
to use and add a GROUP BY
The first difference is, the FROM
clause governs initial cardinality of the results.
In your case, the results will have one row per row in Album. Scalar subqueries in the SELECT
clause cannot change this. If a subquery happens to return multiple rows, SQL Server will throw an exception. It will never add to the results.
When you move this logical join to the FROM
clause, you redefine your initial cardinality. It is no longer one row per row in Album, but one row per row in Album LEFT OUTER JOIN AblumPictures ON...
etc. If that yields multiple rows per row in Album, SQL Server will not throw an exception as it did for the subselect. Rather, it will add rows to the results.
So in this regard, the subquery does a better job expressing intent, and a better job proactively protecting you from data that violates that intent: "give me one row per Album, and for each album, include the URL for here, the nested ID from there" etc.
HOWEVER, functionally speaking, there is a huge drawback: scalar subqueries cannot return whole tuples. You've gone through all this work to write the subquery, SQL Server has gone through all this work to execute it, and now you are limited to this one single scalar return value! Sometimes that's fine, but sometimes you need more. When you need more, you need the FROM
clause.
The closest FROM
-clause equivalent to a scalar subquery is not OUTER JOIN
, but rather the wondrous OUTER APPLY
. OUTER APPLY
is not a scalar expression: it return whole tuples, and any number of rows.
First approximation:
SELECT Albums.*, AlbumPictures.URL, NestedAlbums.AlbumID
FROM Albums
OUTER APPLY (
SELECT TOP (1) * FROM AlbumPictures
WHERE (AlbumID = Albums.AlbumID) AND (AlbumCover = 'True')
) AlbumPictures
OUTER APPLY (
SELECT TOP (1) * FROM NestedAlbums
WHERE (AlbumID = Albums.AlbumID)
) NestedAlbums
WHERE Albums.AlbumID IN (SELECT StringVal FROM funcListToTableInt(@whereAlbumID))
So, by virtue of TOP (1)
, Albums still governs the initial cardinality of the results. However, we now have access to all the columns from the related tables, and this is awesome.
Then, if we are confident that TOP (1)
is not necessary--that by virtue of keys and indexes the subquery can only ever return one row--then we can rewrite using a simpler form:
SELECT Albums.*, AlbumPictures.URL, NestedAlbums.AlbumID
FROM Albums
OUTER APPLY (
SELECT * FROM AlbumPictures
WHERE (AlbumID = Albums.AlbumID) AND (AlbumCover = 'True')
) AlbumPictures
OUTER APPLY (
SELECT * FROM NestedAlbums
WHERE (AlbumID = Albums.AlbumID)
) NestedAlbums
WHERE Albums.AlbumID IN (SELECT StringVal FROM funcListToTableInt(@whereAlbumID))
Which is now logically equivalent to OUTER JOIN
:
SELECT Albums.*, AlbumPictures.URL, NestedAlbums.AlbumID
FROM Albums
LEFT OUTER JOIN AlbumPictures
ON AlbumPictures.AlbumID = Albums.AlbumID
AND AlbumPictures.AlbumCover = 'True'
LEFT OUTER JOIN NestedAlbums
ON NestedAlbums.AlbumID = Albums.AlbumID
WHERE Albums.AlbumID IN (SELECT StringVal FROM funcListToTableInt(@whereAlbumID))
And there you have it. Which is better? Well, whatever you do, please keep it simple.
PERFORMANCE WISE, generally, there is not a whole heck of a lot of difference between the forms. You can compare execution plans side-by-side for your particular tables and indexes. It's a good learning experience to see how SQL Server rewrites logically equivalent queries. I'd expect to see identical plans for OUTER APPLY
(w/o TOP (1)
) and LEFT OUTER JOIN
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With