As an example, I want to get the list of all items with certain tags applied to them. I could do either of the following:
SELECT Item.ID, Item.Name
FROM Item
WHERE Item.ID IN (
SELECT ItemTag.ItemID
FROM ItemTag
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
Or
SELECT Item.ID, Item.Name
FROM Item
LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
GROUP BY Item.ID, Item.Name
Or something entirely different.
In general (assuming there is a general rule), what's a more efficient approach?
SELECT Item.ID, Item.Name FROM Item WHERE Item.ID IN ( SELECT ItemTag.ItemID FROM ItemTag WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)or
SELECT Item.ID, Item.Name FROM Item LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55 GROUP BY Item.ID
Your second query won't compile, since it references Item.Name without either grouping or aggregating on it.
If we remove GROUP BY from the query:
SELECT Item.ID, Item.Name
FROM Item
JOIN ItemTag
ON ItemTag.ItemID = Item.ID
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
these are still different queries, unless ItemTag.ItemId is a UNIQUE key and marked as such.
SQL Server is able to detect an IN condition on a UNIQUE column, and will just transform the IN condition into a JOIN.
If ItemTag.ItemID is not UNIQUE, the first query will use a kind of a SEMI JOIN algorithm, which are quite efficient in SQL Server.
You can trasform the second query into a JOIN:
SELECT Item.ID, Item.Name
FROM Item
JOIN (
SELECT DISTINCT ItemID
FROMT ItemTag
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
) tags
ON tags.ItemID = Item.ID
but this one is a trifle less efficient than IN or EXISTS.
See this article in my blog for a more detailed performance comparison:
IN vs. JOIN vs. EXISTSIf you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With