
Which SQL query is faster? Filter on Join criteria or Where clause?

Compare these two queries. Is it faster to put the filter in the join criteria or in the WHERE clause? I have always felt that it is faster in the join criteria because it reduces the result set at the earliest possible moment, but I don't know for sure.

I'm going to build some tests to see, but I also wanted to get opinions on which is clearer to read.

Query 1

SELECT      *
FROM        TableA a
INNER JOIN  TableXRef x
        ON  a.ID = x.TableAID
INNER JOIN  TableB b
        ON  x.TableBID = b.ID
WHERE       a.ID = 1            /* <-- Filter here? */

Query 2

SELECT      *
FROM        TableA a
INNER JOIN  TableXRef x
        ON  a.ID = x.TableAID
        AND a.ID = 1            /* <-- Or filter here? */
INNER JOIN  TableB b
        ON  x.TableBID = b.ID

EDIT

I ran some tests, and the results are very close, but the WHERE clause is slightly faster! =)

I absolutely agree that it makes more sense to apply the filter in the WHERE clause; I was just curious about the performance implications.

ELAPSED TIME WHERE CRITERIA: 143016 ms
ELAPSED TIME JOIN CRITERIA: 143256 ms

TEST

SET NOCOUNT ON;

DECLARE @num    INT,
        @iter   INT

SELECT  @num    = 1000, -- Number of records in TableA and TableB, the cross table is populated with a CROSS JOIN from A to B
        @iter   = 1000  -- Number of select iterations to perform

DECLARE @a TABLE (
        id INT
)

DECLARE @b TABLE (
        id INT
)

DECLARE @x TABLE (
        aid INT,
        bid INT
)

DECLARE @num_curr INT
SELECT  @num_curr = 1

WHILE (@num_curr <= @num)
BEGIN
    INSERT @a (id) SELECT @num_curr
    INSERT @b (id) SELECT @num_curr

    SELECT @num_curr = @num_curr + 1
END

INSERT      @x (aid, bid)
SELECT      a.id,
            b.id
FROM        @a a
CROSS JOIN  @b b

/*
    TEST
*/
DECLARE @begin_where    DATETIME,
        @end_where      DATETIME,
        @count_where    INT,
        @begin_join     DATETIME,
        @end_join       DATETIME,
        @count_join     INT,
        @curr           INT,
        @aid            INT

DECLARE @temp TABLE (
        curr    INT,
        aid     INT,
        bid     INT
)

DELETE FROM @temp

SELECT  @curr   = 0,
        @aid    = 50

SELECT  @begin_where = CURRENT_TIMESTAMP
WHILE (@curr < @iter)
BEGIN
    INSERT      @temp (curr, aid, bid)
    SELECT      @curr,
                aid,
                bid
    FROM        @a a
    INNER JOIN  @x x
            ON  a.id = x.aid
    INNER JOIN  @b b
            ON  x.bid = b.id
    WHERE       a.id = @aid

    SELECT @curr = @curr + 1
END
SELECT  @end_where = CURRENT_TIMESTAMP

SELECT  @count_where = COUNT(1) FROM @temp
DELETE FROM @temp

SELECT  @curr = 0
SELECT  @begin_join = CURRENT_TIMESTAMP
WHILE (@curr < @iter)
BEGIN
    INSERT      @temp (curr, aid, bid)
    SELECT      @curr,
                aid,
                bid
    FROM        @a a
    INNER JOIN  @x x
            ON  a.id = x.aid
            AND a.id = @aid
    INNER JOIN  @b b
            ON  x.bid = b.id

    SELECT @curr = @curr + 1
END
SELECT  @end_join = CURRENT_TIMESTAMP

SELECT  @count_join = COUNT(1) FROM @temp
DELETE FROM @temp

SELECT  @count_where AS count_where,
        @count_join AS count_join,
        DATEDIFF(millisecond, @begin_where, @end_where) AS elapsed_where,
        DATEDIFF(millisecond, @begin_join, @end_join) AS elapsed_join
Jon Erickson asked Mar 24 '10 17:03



2 Answers

Performance-wise, they are the same (and produce the same plans).

Logically, you should put the filter wherever the query still makes sense if you replace INNER JOIN with LEFT JOIN.

In your case, this would look like this:

SELECT  *
FROM    TableA a
LEFT JOIN
        TableXRef x
ON      x.TableAID = a.ID
        AND a.ID = 1
LEFT JOIN
        TableB b
ON      x.TableBID = b.ID

or this:

SELECT  *
FROM    TableA a
LEFT JOIN
        TableXRef x
ON      x.TableAID = a.ID
LEFT JOIN
        TableB b
ON      b.id = x.TableBID
WHERE   a.id = 1

The former query still returns every row of TableA, but with actual matches only for a.ID = 1 (all other rows are padded with NULLs), so the latter syntax (with WHERE) is logically more consistent.
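To see the difference between the two LEFT JOIN variants concretely, here is a minimal sketch. It is not from the original answer: it runs the two queries through Python's built-in sqlite3 module on a tiny made-up data set, and it assumes SQLite is a fair stand-in for SQL Server here, since the ON versus WHERE semantics of a LEFT JOIN are standard SQL. The sample rows are hypothetical.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableXRef (TableAID INTEGER, TableBID INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES (10), (20);
    INSERT INTO TableXRef VALUES (1, 10), (1, 20), (2, 10);
""")

# Filter in the ON clause: every TableA row survives; only a.ID = 1 gets matches.
on_filter = conn.execute("""
    SELECT a.ID, x.TableBID
    FROM TableA a
    LEFT JOIN TableXRef x ON x.TableAID = a.ID AND a.ID = 1
    LEFT JOIN TableB b ON x.TableBID = b.ID
    ORDER BY a.ID, x.TableBID
""").fetchall()

# Filter in the WHERE clause: rows with a.ID <> 1 are removed after the join.
where_filter = conn.execute("""
    SELECT a.ID, x.TableBID
    FROM TableA a
    LEFT JOIN TableXRef x ON x.TableAID = a.ID
    LEFT JOIN TableB b ON b.ID = x.TableBID
    WHERE a.ID = 1
    ORDER BY a.ID, x.TableBID
""").fetchall()

print(on_filter)     # [(1, 10), (1, 20), (2, None), (3, None)]
print(where_filter)  # [(1, 10), (1, 20)]
```

The ON version keeps TableA rows 2 and 3 with NULL in place of a match, which is rarely what someone filtering on a.ID = 1 wants.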

Quassnoi answered Oct 07 '22 13:10



For inner joins, it doesn't matter where you put your criteria. The SQL compiler will transform both into an execution plan in which the filtering occurs below the join (i.e., as if the filter expression appeared in the join condition).
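One way to check this yourself is to run both forms and compare the rows and the plans. The sketch below is an illustration rather than part of the original answer: it uses Python's built-in sqlite3 module with hypothetical sample rows, and SQLite's EXPLAIN QUERY PLAN as a stand-in for looking at the execution plan in SQL Server. It prints the plans instead of asserting anything about them, since plan text varies by engine and version.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableXRef (TableAID INTEGER, TableBID INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES (10), (20);
    INSERT INTO TableXRef VALUES (1, 10), (1, 20), (2, 10);
""")

# Query 1 from the question: filter in the WHERE clause.
QUERY_WHERE = """
    SELECT a.ID, x.TableBID
    FROM TableA a
    INNER JOIN TableXRef x ON a.ID = x.TableAID
    INNER JOIN TableB b ON x.TableBID = b.ID
    WHERE a.ID = 1
    ORDER BY x.TableBID
"""

# Query 2 from the question: filter in the join criteria.
QUERY_ON = """
    SELECT a.ID, x.TableBID
    FROM TableA a
    INNER JOIN TableXRef x ON a.ID = x.TableAID AND a.ID = 1
    INNER JOIN TableB b ON x.TableBID = b.ID
    ORDER BY x.TableBID
"""

rows_where = conn.execute(QUERY_WHERE).fetchall()
rows_on = conn.execute(QUERY_ON).fetchall()
print(rows_where == rows_on)  # True: same rows either way

# Print each plan for a side-by-side look; the last column holds the description.
for label, sql in (("WHERE", QUERY_WHERE), ("ON", QUERY_ON)):
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    print(label, [row[-1] for row in plan])
```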

Outer joins are a different matter, since the placement of the filter changes the semantics of the query.
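A common case where this bites is a filter on the outer (right-hand) table of a LEFT JOIN. The following is a minimal sketch, assuming SQLite via Python's sqlite3 module behaves like SQL Server on this point (it is standard SQL behavior); the tables reuse the question's names, but the rows and the TableBID = 20 filter are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableXRef (TableAID INTEGER, TableBID INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableXRef VALUES (1, 10), (1, 20), (2, 10);
""")

# Filter on the right-hand table inside ON: unmatched TableA rows are kept
# and padded with NULL.
filter_in_on = conn.execute("""
    SELECT a.ID, x.TableBID
    FROM TableA a
    LEFT JOIN TableXRef x ON x.TableAID = a.ID AND x.TableBID = 20
    ORDER BY a.ID
""").fetchall()

# Same filter in WHERE: the NULL-padded rows fail the test and disappear,
# so the LEFT JOIN effectively behaves like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT a.ID, x.TableBID
    FROM TableA a
    LEFT JOIN TableXRef x ON x.TableAID = a.ID
    WHERE x.TableBID = 20
    ORDER BY a.ID
""").fetchall()

print(filter_in_on)     # [(1, 20), (2, None), (3, None)]
print(filter_in_where)  # [(1, 20)]
```

So with outer joins the choice between ON and WHERE is about which result you want, not about speed.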

Remus Rusanu answered Oct 07 '22 13:10