Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Left Join losing rows after filtering

Tags:

sql

left-join

I have a multi-table join (only two shown in example) where I need to retain all rows from the base table. Obviously I use a LEFT JOIN to include all rows on the base table. Without the WHERE clause it works great – When a row doesn’t exist in the Right table the row from the Left table still shows, just with a 0 from the column in the Right table. The first two rows in the dataset are Labels from the Left table and Count of rows from the Right table, grouped by Label. All I want is a count of 0 when a label does not have a value from Table2 assigned.

Table1

Label | FK
----------
Blue  | 1
Red   | 2
Green | 3

Table2

Values | pk    | Date
---------------------------
Dog    | 1     | 02/02/2010
Cat    | 2     | 02/02/2010
Dog    | 1     | 02/02/2010
Cat    | 2     | 02/02/2010

Query:

SELECT 1.Label, COUNT(2.values)
FROM Table1 1
    LEFT JOIN Table2 2 ON 1.fk = 2.pk
GROUP BY 1.Label

Good Result Set - No filters

Blue  | 2
Red   | 2
Green | 0

Great!

My issue is that when I add filtering criteria to remove rows from the Right table the row is removed for my Left join rows (zeroing them out), the Left rows are dropped. I need the Left rows to remain even if their count is filtered down to zero.

SELECT 1.Label, COUNT(2.values)
FROM Table1 1
    LEFT JOIN Table2 2 ON 1.fk = 1.pk
WHERE 2.Date BETWEEN '1/1/2010' AND '12/31/2010'    
GROUP BY 1.Label

Bummer Result Set - After Filters

Blue | 2
Red  | 2

Dukes!

So, what the hell? Do I need to get a temp table with the filtered dataset THEN join it to the Left table? What am I missing? Thanks!

Do a second join or recursive join. Get my “good” join table, get a second “filtered” table, then LEFT JOIN them

like image 827
BClaydon Avatar asked Sep 11 '13 20:09

BClaydon


People also ask

Does LEFT join remove rows?

The left join clause is used to select all rows from the left table, even if they have or don't have matching in the right table. Delete left join table is used to delete rows from the left table that do not have matching records in the right table.

Does LEFT join return duplicate rows?

Again, if we perform a left outer join where date = date, each row from Table 5 will join on to every matching row from Table 4. However, in this case, the join will result in 4 rows of duplicate dates in the joined DataSet (see Table 6).

How many rows does LEFT join return?

The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in the right table; the join will still return a row in the result, but with NULL in each column from the right table.

Does LEFT join increases number of rows?

There are two line items for ID 1003 in the second table, so the result of the join will be 2 line items. So, if your secondary tables have more than one row for the key you're joining with, then the result of the join will be multiple rows, resulting in more rows than the left table.


3 Answers

You are filtering on the second table in the where. The values could be NULL and NULL fails the comparisons.

Move the where condition to the on clause:

SELECT 1.Label, COUNT(2.values)
FROM Table1 1 LEFT JOIN
     Table2 2
     ON 1.fk = 1.pk AND
        2.Date BETWEEN 1/1/2010 AND 12/31/2010    
GROUP BY 1.Label

Note:

The date formats retain the dates from the question. However, I don't advocate using BETWEEN for dates and the conditions should use standard date formats:

SELECT 1.Label, COUNT(2.values)
FROM Table1 1 LEFT JOIN
     Table2 2
     ON 1.fk = 1.pk AND
        2.Date >= '2010-01-01' AND
        2.Date < '2011-01-01'
GROUP BY 1.Label;

Some databases support the SQL Standard keyword DATE to identify date constants.

like image 122
Gordon Linoff Avatar answered Sep 21 '22 05:09

Gordon Linoff


simply move the condition in WHERE clause to the ON clause.

LEFT JOIN Table2 2 ON 1.fk = 1.pk AND 2.Date BETWEEN '1/1/2010' AND '12/31/2010'
like image 35
John Woo Avatar answered Sep 19 '22 05:09

John Woo


First of all i don't think it is a good idea to have number as an alias. Use a character or a word instead.

I would put the criteria in the join.

SELECT T1.Label, COUNT(T2.values)
FROM Table1 T1
    LEFT JOIN Table2 T2 ON T1.fk = T1.pk
    AND T2.Date BETWEEN '20100101' AND '20101231'    
GROUP BY T1.Label

Other comments:

  1. I would use ANSI(yyyyMMdd) format for the dates, so there is no missunderstanding.
  2. I would be aware of the BETWEEN as it is not clear what you want to do if the date is '20101231 01:00' Would you like to include that date or not? >= and <= or >= and < is clearer.
like image 1
Giannis Paraskevopoulos Avatar answered Sep 20 '22 05:09

Giannis Paraskevopoulos