Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of two left joins versus union

I have searched but have not found a definitive answer. Which of these is better for performance in SQL Server:

SELECT T.*
FROM dbo.Table1 T
    LEFT JOIN Table2 T2 ON T.ID = T2.Table1ID
    LEFT JOIN Table3 T3 ON T.ID = T3.Table1ID
WHERE T2.Table1ID IS NOT NULL
    OR T3.Table1ID IS NOT NULL

or...

SELECT T.*
FROM dbo.Table1 T
    JOIN Table2 T2 ON T.ID = T2.Table1ID
UNION
SELECT T.*
FROM dbo.Table1 T
    JOIN Table3 T3 ON T.ID = T3.Table1ID

I have tried running both but it's hard to tell for sure. I'd appreciate an explanation of why one is faster than the other, or if it depends on the situation.

like image 487
mayabelle Avatar asked Oct 01 '22 07:10

mayabelle


1 Answers

Your two queries do not do the same things. In particular, the first will return duplicate rows if values are duplicated in either table.

If you are looking for rows in Table1 that are in either of the other two tables, I would suggest using exists:

select t1.*
from Table1 t1
where exists (select 1 from Table2 t2 where t2.Table1Id = t1.id) or
      exists (select 1 from Table3 t3 where t3.Table1Id = t1.id);

And, create indexes on Table1Id in both Table2 and Table3.

Which of your original queries is faster depends a lot on the data. The second has an extra step to remove duplicates (union verses union all). On the other hand, the first might end up creating many duplicate rows.

like image 174
Gordon Linoff Avatar answered Oct 25 '22 07:10

Gordon Linoff