Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Left Join outperforming Inner Join?

I've been profiling some queries in an application I'm working on, and I came across a query that was retrieving more rows than necessary, the result set being trimmed down in the application code.

Changing a LEFT JOIN to an INNER JOIN trimmed the result set to just what was needed, and presumably would also be more performant (since less rows are selected). In reality, the LEFT JOIN'ed query was outperforming the INNER JOIN'ed, taking half the time to complete.

LEFT JOIN: (127 total rows, Query took 0.0011 sec)

INNER JOIN: (10 total rows, Query took 0.0024 sec)

(I ran the queries multiple times and those are averages).

Running EXPLAIN on both reveals nothing that explains the performance differences:

For the INNER JOIN:

id  select_type     table   type    possible_keys   key     key_len     ref        rows     Extra
1   SIMPLE  contacts        index       NULL        name        302     NULL         235    Using where
1   SIMPLE  lists           eq_ref      PRIMARY     PRIMARY     4   contacts.list_id     1   
1   SIMPLE  lists_to_users  eq_ref      PRIMARY     PRIMARY     8   lists.id,const  1    
1   SIMPLE  tags            eq_ref      PRIMARY     PRIMARY     4   lists_to_users.tag_id   1    
1   SIMPLE  users           eq_ref      email_2     email_2     302     contacts.email 1    Using where

For the LEFT JOIN:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE          contacts index      NULL        name        302     NULL    235     Using where
1   SIMPLE        lists     eq_ref      PRIMARY     PRIMARY     4   contacts.list_id    1    
1   SIMPLE    lists_to_users eq_ref     PRIMARY     PRIMARY     8   lists.id,const  1    
1   SIMPLE         tags     eq_ref      PRIMARY     PRIMARY     4   lists_to_users.tag_id   1    
1   SIMPLE        users     eq_ref      email_2     email_2     302     contacts.email  1   

And the query itself:

SELECT `contacts`.*, `lists`.`name` AS `group`, `lists`.`id` AS `group_id`, `lists`.`shared_yn`, `tags`.`name` AS `context`, `tags`.`id` AS `context_id`, `tags`.`color` AS `context_color`, `users`.`id` AS `user_id`, `users`.`avatar` 
FROM `contacts`  
LEFT JOIN `lists` ON lists.id=contacts.list_id  
LEFT JOIN `lists_to_users` ON lists_to_users.list_id=lists.id AND lists_to_users.user_id='1' AND lists_to_users.creator='1'  
LEFT JOIN `tags` ON tags.id=lists_to_users.tag_id 
INNER JOIN `users` ON users.email=contacts.email 
WHERE (contacts.user_id='1') 
ORDER BY `contacts`.`name` ASC

(The clause that I'm talking about is the last INNER JOIN on the 'users' table)

The query runs on a MySQL 5.1 database, if it makes a difference.

Does anyone has a clue on why the LEFT JOIN'ed query outperforms the INNER JOIN'ed on in this case?

UPDATE: Due to Tomalak's suggestion that the small tables I'm using were making the INNER JOIN more complex, I'd created a test database with some mock data. The 'users' table is 5000 rows, and the contacts table is ~500,000 rows. The results are the same (also the timings haven't changed which is surprising when you consider that the tables are much bigger now).

I also ran ANALYZE and OPTIMIZE on the contacts table. Didn't make any discernible difference.

like image 744
Eran Galperin Avatar asked Oct 09 '08 06:10

Eran Galperin


2 Answers

If you think that the implementation of LEFT JOIN is INNER JOIN + more work, then this result is confusing. What if the implementation of INNER JOIN is (LEFT JOIN + filtering)? Ah, it is clear now.

In the query plans, the only difference is this: users... extra: using where . This means filtering. There's an extra filtering step in the query with the inner join.


This is a different kind of filtering than is typically used in a where clause. It is simple to create an index on A to support this filtering action.

SELECT *
FROM A
WHERE A.ID = 3

Consider this query:

SELECT *
FROM A
  LEFT JOIN B
  ON A.ID = B.ID
WHERE B.ID is not null

This query is equivalent to inner join. There is no index on B that will help that filtering action. The reason is that the where clause is stating a condition on the result of the join, instead of a condition on B.

like image 147
Amy B Avatar answered Sep 20 '22 17:09

Amy B


It's probably due to the INNER JOIN having to check each row in both tables to see if the column values (email in your case) match. The LEFT JOIN will return all from one table regardless. If it's indexed then it will know what to do faster too.

like image 33
HAdes Avatar answered Sep 20 '22 17:09

HAdes