Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL Inner Join VS Left Join w/ IS NOT NULL?

Tags:

mysql

Is there a performance difference in the following?

SELECT person.id
FROM person
LEFT JOIN address ON person.id = address.personID
WHERE address.personID IS NOT NULL

vs

SELECT person.id
FROM person
INNER JOIN address ON person.id = address.personID

This query should show all person ID's that have an address record (not all do). The logical thing to do here would be to use an Inner Join as the second example shows. For reasons that are not entirely important (the query is being generated from a query builder), I may have to use the first approach.

Curious what the impact is. Does MySQL do a lot of extra work when it LEFT JOIN's and then compares that field against null to reduce the set? Maybe that's how INNER JOIN works behind the scenes anyway?

like image 228
Brian Avatar asked Mar 21 '14 20:03

Brian


2 Answers

As demonstrated below, there is a potential for different execution plans for these two queries:

SELECT p.*, s.*
  FROM p
  LEFT
  JOIN s ON s.col = p.col
 WHERE s.col IS NOT NULL

SELECT p.*, s.*
  FROM p
 INNER
  JOIN s ON s.col = p.col

id select_type table  type poss key  key_len ref   rows Extra
-- ----------- ------ ---- ---- ---- ------- ----- ---- --------
 1 SIMPLE      p      ALL  -    -    -       -        3
 1 SIMPLE      s      ref  s_ix s_ix 9       p.col    1

id select_type table  type poss key  key_len ref   rows Extra
-- ----------- ------ ---- ---- ---- ------- ----- ---- -----------------------------
 1 SIMPLE      s      ALL  s_ix -    -       -        2
 1 SIMPLE      p      ALL  p_ix -    -       -        3 Using where; Using join buffer

So, we have to conclude that there could be a difference in performance. On small sets, the difference is going to be negligible. It's possible that large sets will show a significant difference in performance; we'd expect the INNER JOIN to be more efficient. It's entirely possible there's a test case that demonstrates better performance of the LEFT JOIN, but I haven't found it.

like image 73
spencer7593 Avatar answered Sep 27 '22 20:09

spencer7593


It may depend on the version of MySQL, because optimizer code is improved in each release. This might be a case where an older version does more work for the left outer join, resulting in a table-scan of person even if it would be more efficient to find a specific address and then do the join in the reverse direction.

@spencer7593 demonstrates a case where the two join types result in a different table ordering by the optimizer, which means that a left join forces the left table to be accessed first. (though in his example, the "using join buffer" indicates that there's no index for the join, so this could be an anomaly.)

But I have seen cases where the optimizer detects that the query is equivalent to an inner join, because you have conditions in the WHERE clause on the "outer" table. So it produces exactly the same optimization plan for a left outer join as it does for an inner join, and allows for table re-ordering.

like image 37
Bill Karwin Avatar answered Sep 27 '22 19:09

Bill Karwin