As an example, I want to get the list of all items with certain tags applied to them. I could do either of the following:
SELECT Item.ID, Item.Name
FROM Item
WHERE Item.ID IN (
SELECT ItemTag.ItemID
FROM ItemTag
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
Or
SELECT Item.ID, Item.Name
FROM Item
LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
GROUP BY Item.ID, Item.Name
Or something entirely different.
In general (assuming there is a general rule), what's a more efficient approach?
SELECT Item.ID, Item.Name FROM Item WHERE Item.ID IN ( SELECT ItemTag.ItemID FROM ItemTag WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
or
SELECT Item.ID, Item.Name FROM Item LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55 GROUP BY Item.ID
Your second query won't compile, since it references Item.Name
without either grouping or aggregating on it.
If we remove GROUP BY
from the query:
SELECT Item.ID, Item.Name
FROM Item
JOIN ItemTag
ON ItemTag.ItemID = Item.ID
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
these are still different queries, unless ItemTag.ItemId
is a UNIQUE
key and marked as such.
SQL Server
is able to detect an IN
condition on a UNIQUE
column, and will just transform the IN
condition into a JOIN
.
If ItemTag.ItemID
is not UNIQUE
, the first query will use a kind of a SEMI JOIN
algorithm, which are quite efficient in SQL Server
.
You can trasform the second query into a JOIN
:
SELECT Item.ID, Item.Name
FROM Item
JOIN (
SELECT DISTINCT ItemID
FROMT ItemTag
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
) tags
ON tags.ItemID = Item.ID
but this one is a trifle less efficient than IN
or EXISTS
.
See this article in my blog for a more detailed performance comparison:
IN
vs. JOIN
vs. EXISTS
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With