LEFT JOIN returns same result as INNER JOIN

I have one table (scrubs) with 1600 unique items and a second table (pdwspend) with over 1 million rows. I run my INNER JOIN and get 65 matches:

SELECT a.`BW Parent Number` , a.`Vendor Name`, b.`Parent Supplier Name` 
FROM `scrubs` AS a
JOIN pdwspend AS b ON a.`BW Parent Number` = b.`Child Supplier ID`
WHERE a.`year` =2014
AND b.`BU ID` = 'BU_1'
AND b.version LIKE '%GOV%'
GROUP BY a.`BW Parent Number` 

Then I run a LEFT OUTER JOIN and get the same 65 results:

SELECT a.`BW Parent Number` , a.`Vendor Name`, b.`Parent Supplier Name`  
FROM `scrubs` AS a
LEFT OUTER JOIN pdwspend AS b ON a.`BW Parent Number` = b.`Child Supplier ID` 
WHERE a.`year` =2014
AND b.`BU ID` = 'BU_1'
AND b.version LIKE '%GOV%'
GROUP BY a.`BW Parent Number`

Why is it not bringing all the rows from the left table and showing NULL under b.`Parent Supplier Name` for the ones that don't match?

Thanks!

asked Sep 21 '14 00:09 by nangys



2 Answers

Why is it not bringing all the rows from the left table and showing NULL for the ones that don't match?

Your LEFT JOIN didn't work as expected because of the conditions on the joined table (b) in the WHERE clause; this is sometimes called an "implicit inner join".

This is best explained by demonstration. Below are two simple tables, users and user_notes:

    | USER_ID | FIRST_NAME |  LAST_NAME |
    |---------|------------|------------|
    |     123 |       Fred | Flintstone |
    |     456 |     Barney |     Rubble |

    |    ID | USER_ID |       NOTE_BODY |
    |-------|---------|-----------------|
    | 98765 |     123 | Yabba Dabba Doo |

So, as you would expect, an INNER JOIN produces just one row: the one where the User_ID values match in both tables.

SELECT * FROM users u INNER JOIN user_notes n ON u.User_ID = n.User_ID;

    | USER_ID | FIRST_NAME |  LAST_NAME |    ID |       NOTE_BODY |
    |---------|------------|------------|-------|-----------------|
    |     123 |       Fred | Flintstone | 98765 | Yabba Dabba Doo |

By changing to a LEFT JOIN we get all records in Users, but not all of them have information from User_Notes, so we get NULLs in those columns:

SELECT * FROM users u LEFT JOIN user_notes n ON u.User_ID = n.User_ID;

    | USER_ID | FIRST_NAME |  LAST_NAME |     ID |       NOTE_BODY |
    |---------|------------|------------|--------|-----------------|
    |     123 |       Fred | Flintstone |  98765 | Yabba Dabba Doo |
    |     456 |     Barney |     Rubble | (null) |          (null) |

But what happens if we really only want SOME records from the joined table?

SELECT * FROM users u LEFT JOIN user_notes n ON u.User_ID = n.User_ID WHERE n.Note_Body = 'Yabba Dabba Doo';

    | USER_ID | FIRST_NAME |  LAST_NAME |    ID |       NOTE_BODY |
    |---------|------------|------------|-------|-----------------|
    |     123 |       Fred | Flintstone | 98765 | Yabba Dabba Doo |

Well, if we use a WHERE condition on the joined table, the effect is the same as an INNER JOIN: we no longer get all User records. This is the implicit inner join.
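The three behaviours shown so far can be reproduced with a minimal sketch using Python's built-in sqlite3 module and the two example tables. The in-memory database and the helper name `count_rows` are assumptions made for illustration; the question concerns MySQL, but the join semantics exercised here are standard SQL.

```python
import sqlite3

# In-memory database holding the two example tables from this answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE user_notes (id INTEGER, user_id INTEGER, note_body TEXT);
    INSERT INTO users VALUES (123, 'Fred', 'Flintstone'), (456, 'Barney', 'Rubble');
    INSERT INTO user_notes VALUES (98765, 123, 'Yabba Dabba Doo');
""")


def count_rows(sql):
    """Hypothetical helper: run a query and return how many rows it yields."""
    return len(conn.execute(sql).fetchall())


inner_count = count_rows(
    "SELECT * FROM users u INNER JOIN user_notes n ON u.user_id = n.user_id")
left_count = count_rows(
    "SELECT * FROM users u LEFT JOIN user_notes n ON u.user_id = n.user_id")
# Filtering the joined table in WHERE discards Barney's NULL-padded row.
left_where_count = count_rows(
    "SELECT * FROM users u LEFT JOIN user_notes n ON u.user_id = n.user_id "
    "WHERE n.note_body = 'Yabba Dabba Doo'")

print(inner_count, left_count, left_where_count)
```

The LEFT JOIN with the WHERE filter yields the same single row as the INNER JOIN, while the unfiltered LEFT JOIN yields both users.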

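We don't get all user records because we have now insisted that all result rows MUST have a certain value in a column that might be NULL. A comparison such as n.Note_Body = 'Yabba Dabba Doo' is never true when Note_Body is NULL, so the NULL-padded rows produced by the outer join are filtered out. So, we could alter the WHERE condition to permit NULLs: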

SELECT * FROM users u LEFT JOIN user_notes n ON u.User_ID = n.User_ID WHERE ( n.Note_Body = 'Yabba Dabba Doo' OR n.Note_Body IS NULL);

    | USER_ID | FIRST_NAME |  LAST_NAME |     ID |       NOTE_BODY |
    |---------|------------|------------|--------|-----------------|
    |     123 |       Fred | Flintstone |  98765 | Yabba Dabba Doo |
    |     456 |     Barney |     Rubble | (null) |          (null) |

OR

Instead of filtering the joined table in the WHERE clause, we add the condition to the join conditions (i.e. after ON):

SELECT * FROM users u LEFT JOIN user_notes n ON u.User_ID = n.User_ID AND n.Note_Body = 'Yabba Dabba Doo';

    | USER_ID | FIRST_NAME |  LAST_NAME |     ID |       NOTE_BODY |
    |---------|------------|------------|--------|-----------------|
    |     123 |       Fred | Flintstone |  98765 | Yabba Dabba Doo |
    |     456 |     Barney |     Rubble | (null) |          (null) |
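The two fixes give the same result on this data, but they are not interchangeable in general. The sketch below assumes a hypothetical third user (Wilma) whose only note has a different body; neither she nor her note is part of the original example. With the `OR ... IS NULL` filter she is dropped, because her joined row carries a non-NULL, non-matching Note_Body. With the condition in ON she is kept and padded with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Wilma (789) and her note 'Hello' are hypothetical additions for illustration.
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE user_notes (id INTEGER, user_id INTEGER, note_body TEXT);
    INSERT INTO users VALUES
        (123, 'Fred', 'Flintstone'),
        (456, 'Barney', 'Rubble'),
        (789, 'Wilma', 'Flintstone');
    INSERT INTO user_notes VALUES
        (98765, 123, 'Yabba Dabba Doo'),
        (98766, 789, 'Hello');
""")

# Fix 1: keep the filter in WHERE but also accept NULLs.
where_names = [row[0] for row in conn.execute(
    "SELECT u.first_name FROM users u "
    "LEFT JOIN user_notes n ON u.user_id = n.user_id "
    "WHERE (n.note_body = 'Yabba Dabba Doo' OR n.note_body IS NULL) "
    "ORDER BY u.user_id")]

# Fix 2: move the filter into the join condition.
on_names = [row[0] for row in conn.execute(
    "SELECT u.first_name FROM users u "
    "LEFT JOIN user_notes n ON u.user_id = n.user_id "
    "AND n.note_body = 'Yabba Dabba Doo' "
    "ORDER BY u.user_id")]

print(where_names)
print(on_names)
```

If the goal is "every row from the left table, with matching details where they exist", the ON version is the one that guarantees it.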

So, be careful when using outer joins that your where clause does not override the NULLs that an outer join allows.

See the above as a SQLFiddle demonstration

answered Nov 09 '22 20:11 by Paul Maxwell


Because the filters on table b are in the WHERE clause rather than in the ON clause. Change it to:

SELECT a.`BW Parent Number`, a.`Vendor Name`, b.`Parent Supplier Name`
  FROM scrubs AS a
  LEFT OUTER JOIN pdwspend AS b
    ON a.`BW Parent Number` = b.`Child Supplier ID`
   AND b.`BU ID` = 'BU_1'
   AND b.`version` LIKE '%GOV%'
 WHERE a.`year` = 2014

Also, the GROUP BY doesn't make sense here. You would use a GROUP BY clause only if you were aggregating something (for example with COUNT or SUM).

Based on your comment regarding repeated rows, that is probably because your table called "pdwspend" has more than one row for each 'Child Supplier ID', and that is the only field on that table with which you are joining the "scrubs" table. So yes, each row in scrubs is returned once for every matching row in pdwspend. (There are likely other columns on that table, so they really are not "repeated" rows; you're just not selecting enough columns for the differences to show.)

(The reason you get an error in the query you put in your comments is that your inline view, the subquery in your FROM clause, does not select the 'Parent Supplier Name' field. It does not exist in that inline view because you didn't add it to the select list of that inline view.)

Because you are only interested in a select number of columns and don't want rows 'repeated' based on those columns, you can try DISTINCT in an inline view:

     select a.`BW Parent Number`, a.`Vendor Name`, b.`Parent Supplier Name`
       from scrubs a
  left join ( select distinct `Child Supplier ID`, `Parent Supplier Name`
                         from pdwspend
                        where `BU ID` = 'BU_1'
                          and `version` LIKE '%GOV%') b
         on a.`BW Parent Number` = b.`Child Supplier ID`
      where a.`year` = 2014
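
The row multiplication and the DISTINCT inline view can be checked on a small scale with Python's built-in sqlite3 module. This is only a sketch: every row of data below is made up, the column types are guesses, and the pattern '%GOV%' is taken from the question. SQLite accepts MySQL-style backtick identifiers, so the queries keep the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: supplier '100' has two matching rows in pdwspend
# (BU_1, GOV versions) plus one row for a different BU; '200' has none.
conn.executescript("""
    CREATE TABLE scrubs (`BW Parent Number` TEXT, `Vendor Name` TEXT, `year` INTEGER);
    CREATE TABLE pdwspend (`Child Supplier ID` TEXT, `Parent Supplier Name` TEXT,
                           `BU ID` TEXT, `version` TEXT);
    INSERT INTO scrubs VALUES ('100', 'Acme', 2014), ('200', 'Globex', 2014);
    INSERT INTO pdwspend VALUES
        ('100', 'Acme Holdings', 'BU_1', '2014_GOV_v1'),
        ('100', 'Acme Holdings', 'BU_1', '2014_GOV_v2'),
        ('100', 'Acme Holdings', 'BU_2', '2014_GOV_v1');
""")

# Plain LEFT JOIN with the filters in ON: one output row per matching b row.
plain_rows = conn.execute("""
    SELECT a.`BW Parent Number`, a.`Vendor Name`, b.`Parent Supplier Name`
      FROM scrubs AS a
      LEFT JOIN pdwspend AS b
        ON a.`BW Parent Number` = b.`Child Supplier ID`
       AND b.`BU ID` = 'BU_1'
       AND b.`version` LIKE '%GOV%'
     WHERE a.`year` = 2014
     ORDER BY a.`BW Parent Number`
""").fetchall()

# DISTINCT inline view: b is collapsed to one row per supplier before joining.
distinct_rows = conn.execute("""
    SELECT a.`BW Parent Number`, a.`Vendor Name`, b.`Parent Supplier Name`
      FROM scrubs AS a
      LEFT JOIN (SELECT DISTINCT `Child Supplier ID`, `Parent Supplier Name`
                   FROM pdwspend
                  WHERE `BU ID` = 'BU_1'
                    AND `version` LIKE '%GOV%') AS b
        ON a.`BW Parent Number` = b.`Child Supplier ID`
     WHERE a.`year` = 2014
     ORDER BY a.`BW Parent Number`
""").fetchall()

print(len(plain_rows), len(distinct_rows))
```

On this sample data the plain join returns the Acme row twice, while the inline-view version returns each scrubs row once, with NULL for the supplier that has no match.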

answered Nov 09 '22 21:11 by Brian DeMilia