
SQL Server : does order of full outer join matter?

I have 4 full outer joins in my query and it's really slow. Does the order of the FULL OUTER JOINs make a difference to performance or to the result?

Let ⋈ denote FULL OUTER JOIN.

My situation is: A ⋈ B ⋈ C ⋈ D

All joins are on a key k that is common to all of A, B, C and D.

Then:

  • Will changing the order of ⋈ joins make a difference to performance ?
  • Will changing the order of ⋈ change the result ?

I feel that it should not affect the result, but I am not sure whether it will affect performance.

Update:

Will SQL Server automatically rearrange the joins for better performance, assuming the result set is independent of the order?

Yugal Jindle asked May 21 '12 11:05


2 Answers

No, rearranging the join order should not normally affect performance. MSSQL (like other DBMSs) has a query optimizer whose job is to find an efficient query plan for any given query. Generally, these optimizers do a pretty good job, so you're unlikely to beat the optimizer easily.

That said, they do get it wrong occasionally. That's where reading an execution plan comes into play. You can add JOIN hints to tell MSSQL how to join your tables (at which point, ordering does matter). You'd generally order from smallest to largest table (though, with a FULL JOIN, it's not likely to matter very much) and follow the rules of thumb for join types.

Since you're doing FULL JOINS, you're basically reading the entirety of 4 tables off disk. That's likely to be very expensive. You may want to re-examine the problem, and see if it can be accomplished in a different way.
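One possible "different way", if the key is never NULL and is unique within each table, is to collect the distinct keys with UNION and then LEFT JOIN each table to that key list. The sketch below only shows the shape of the query and that it returns one row per key. The tables, the columns `va` to `vd` and the sample rows are made up for illustration, and Python's built-in sqlite3 is used purely as a convenient stand-in for SQL Server. Whether this is actually faster on your data is something only the execution plan can tell you.

```python
import sqlite3

# In-memory database with four small hypothetical tables sharing the key "id".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, va TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, vb TEXT);
    CREATE TABLE c (id INTEGER PRIMARY KEY, vc TEXT);
    CREATE TABLE d (id INTEGER PRIMARY KEY, vd TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (2, 'b2'), (3, 'b3');
    INSERT INTO c VALUES (3, 'c3');
    INSERT INTO d VALUES (4, 'd4');
""")

# Build the list of all keys once, then attach each table with a LEFT JOIN.
# Every join condition refers to k.id, so no COALESCE is needed in the ON clauses.
rows = conn.execute("""
    SELECT k.id, a.va, b.vb, c.vc, d.vd
    FROM (SELECT id FROM a
          UNION SELECT id FROM b
          UNION SELECT id FROM c
          UNION SELECT id FROM d) AS k
    LEFT JOIN a ON a.id = k.id
    LEFT JOIN b ON b.id = k.id
    LEFT JOIN c ON c.id = k.id
    LEFT JOIN d ON d.id = k.id
    ORDER BY k.id
""").fetchall()

for row in rows:
    print(row)
```

Because every LEFT JOIN compares against the same non-NULL key column, the order in which a, b, c and d are attached does not change the result.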

Mark Brackett answered Nov 05 '22 21:11


  • Will changing the order of ⋈ change the result ?

No, the order of the FULL JOINs does not matter; the result will be the same. Notice, however, that you can't use something like this (the following may give different results depending on the order of the joins):

SELECT 
    COALESCE(a.id, b.id, c.id, d.id) AS id,  --- Key columns used in FULL JOIN
    a.*, b.*, c.*, d.*                       --- other columns                 
FROM a 
  FULL JOIN b
      ON b.id = a.id
  FULL JOIN c
      ON c.id = a.id
  FULL JOIN d
      ON d.id = a.id ;

You have to use something like this (no difference in results whatever the order of joins):

SELECT 
    COALESCE(a.id, b.id, c.id, d.id) AS id,   
    a.*, b.*, c.*, d.*                                   
FROM a 
  FULL JOIN b
      ON b.id = a.id
  FULL JOIN c
      ON c.id = COALESCE(a.id, b.id) 
  FULL JOIN d
      ON d.id = COALESCE(a.id, b.id, c.id) ;
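To see why the join condition matters, here is a small pure-Python simulation of the two query shapes above. It is a sketch under simplifying assumptions, not SQL Server's actual join algorithm: each table holds only a unique `id`, the sample data is made up, and `full_join_chain` and `as_tuples` are hypothetical helper names. With `use_coalesce=False` every join compares against the first table's id only (like `ON c.id = a.id`). With `use_coalesce=True` it compares against the COALESCE of all ids joined so far.

```python
def coalesce(*values):
    """Return the first non-None value, like SQL COALESCE."""
    return next((v for v in values if v is not None), None)


def full_join_chain(tables, order, use_coalesce):
    """FULL JOIN the tables (name -> set of unique ids) in the given order."""
    first = order[0]
    rows = [{first: i} for i in tables[first]]
    joined = [first]
    for name in order[1:]:
        out, matched = [], set()
        for row in rows:
            if use_coalesce:
                key = coalesce(*(row[t] for t in joined))
            else:
                key = row[first]
            # A NULL key never matches, as in SQL.
            if key is not None and key in tables[name]:
                out.append({**row, name: key})
                matched.add(key)
            else:
                out.append({**row, name: None})
        # Rows of the right-hand table that matched nothing are kept too.
        for i in tables[name]:
            if i not in matched:
                out.append({**{t: None for t in joined}, name: i})
        rows = out
        joined.append(name)
    return rows


def as_tuples(rows):
    """Normalise rows to a set of (a, b, c) tuples so results can be compared."""
    return {tuple(r[t] for t in ("a", "b", "c")) for r in rows}


tables = {"a": {1}, "b": {2}, "c": {2}}

plain_abc = as_tuples(full_join_chain(tables, ["a", "b", "c"], False))
plain_bca = as_tuples(full_join_chain(tables, ["b", "c", "a"], False))
coalesce_abc = as_tuples(full_join_chain(tables, ["a", "b", "c"], True))
coalesce_bca = as_tuples(full_join_chain(tables, ["b", "c", "a"], True))

print(len(plain_abc), len(plain_bca))        # 3 2: the result depends on the order
print(len(coalesce_abc), len(coalesce_bca))  # 2 2: the same rows in either order
```

In the first form, key 2 exists in b and c but not in a, so when a comes first the b row and the c row both see a NULL `a.id` and never meet. Comparing against the COALESCE of the earlier keys puts them on the same row regardless of order.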

  • Will changing the order of ⋈ joins make a difference to performance?

Taking into consideration that the second and third joins have to be done on the COALESCE() of the columns and not the columns themselves, I think only testing with large enough tables will show if the indexes can be used effectively.

ypercubeᵀᴹ answered Nov 05 '22 23:11