I have a Notes
table with a uniqueidentifier
column that I use as a FK for a variety of other tables in the database (don't worry, the uniqueidentifier
columns on the other tables aren't clustered PKs). These other tables represent something of a hierarchy of business objects. As a simple representation, let's say I have two other tables:
In the display of a Lead
in the application, I need to show all notes related to the lead, including those tagged to any Quote
that belongs to that lead. I have two options as far as I can see — either a UNION ALL
or several LEFT JOIN
statements. Here's how they'd look:
SELECT N.*
FROM Notes N
JOIN Leads L ON N.TargetUniqueID = L.UniqueID
WHERE L.LeadID = @LeadID
UNION ALL
SELECT N.*
FROM Notes N
JOIN Quotes Q ON N.TargetUniqueID = Q.UniqueID
WHERE Q.LeadID = @LeadID
Or...
SELECT N.*
FROM Notes N
LEFT JOIN Leads L ON N.TargetUniqueID = L.UniqueID
LEFT JOIN Quotes Q ON N.TargetUniqueID = Q.UniqueID
WHERE L.LeadID = @LeadID OR Q.LeadID = @LeadID
In real life I have a total of five tables that the notes could be attached to, and that number could grow as the application grows. I already have non-clustered indexes set up on the uniqueidentifier
columns I'm using, and SQL Profiler says I can't make any more improvements, but when I do a performance test on a realistically-sized test data set, I get the following numbers:
UNION ALL
— 0.010 secLEFT JOIN
— 0.744 secI had always heard that using UNION
was bad, and that UNION ALL
was only marginally better, but the performance numbers don't seem to bear that out. Granted, the UNION ALL
SQL code might be more of a pain to maintain, but at that kind of performance difference it's probably worth it.
So is UNION ALL
really better here or am I missing something on the LEFT JOIN
code that is slowing things down?
Union will be faster, as it simply passes the first SELECT statement, and then parses the second SELECT statement and adds the results to the end of the output table.
In case there are a large number of rows in the tables and there is an index to use, INNER JOIN is generally faster than OUTER JOIN. Generally, an OUTER JOIN is slower than an INNER JOIN as it needs to return more number of records when compared to INNER JOIN.
TLDR: The most efficient join is also the simplest join, 'Relational Algebra'. If you wish to find out more on all the methods of joins, read further. Relational algebra is the most common way of writing a query and also the most natural way to do so.
The advantage of a join includes that it executes faster. The retrieval time of the query using joins almost always will be faster than that of a subquery. By using joins, you can maximize the calculation burden on the database i.e., instead of multiple queries using one join query.
The UNION ALL
version would probably be satisfied quite easily by 2 index seeks. OR
can lead to scans. What do the execution plans look like?
Also have you tried this to avoid accessing Notes
twice?
;WITH J AS
(
SELECT UniqueID FROM Leads WHERE LeadID = @LeadID
UNION ALL
SELECT UniqueID FROM Quotes WHERE LeadID = @LeadID
)
SELECT N.* /*Don't use * though!*/
FROM Notes N
JOIN J ON N.TargetUniqueID = J.UniqueID
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With