I'm having a little trouble using multiple LEFT JOINs in one query. Some of the tables have a one-to-one relationship with the left table and some have a one-to-many relationship. The query looks like this:
Select
files.filename,
coalesce(count(distinct case
when dm_data.weather like '%clear%' then 1
end),
0) as clear,
coalesce(count(distinct case
when dm_data.weather like '%lightRain%' then 1
end),
0) as lightRain,
coalesce(count(case
when kc_data.type like '%bicycle%' then 1
end),
0) as bicycle,
coalesce(count(case
when kc_data.type like '%bus%' then 1
end),
0) as bus,
coalesce(count(case
when kpo_data.movement like '%walking%' then 1
end),
0) as walking,
coalesce(count(case
when kpo_data.type like '%pedestrian%' then 1
end),
0) as pedestrian
from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id
left join
kpo_data ON kpo_data.id = files.id
where
files.filename in (X, Y, Z, ........)
group by files.filename;
Here, the dm_data table has a one-to-one relationship with the 'files' table (that's why I'm using 'Distinct'), whereas kc_data and kpo_data have a one-to-many relationship with the 'files' table (kc_data and kpo_data can have 10 to 20 rows against one files.id). This query works fine.
The problem arises when I add another left join with another one-to-many table pd_markings (which can have 100s of rows against one files.id).
Select
files.filename,
coalesce(count(distinct case
when dm_data.weather like '%clear%' then 1
end),
0) as clear,
coalesce(count(distinct case
when dm_data.weather like '%lightRain%' then 1
end),
0) as lightRain,
coalesce(count(case
when kc_data.type like '%bicycle%' then 1
end),
0) as bicycle,
coalesce(count(case
when kc_data.type like '%bus%' then 1
end),
0) as bus,
coalesce(count(case
when kpo_data.movement like '%walking%' then 1
end),
0) as walking,
coalesce(count(case
when kpo_data.type like '%pedestrian%' then 1
end),
0) as pedestrian,
coalesce(count(case
when pd_markings.movement like '%walking%' then 1
end),
0) as walking
from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id
left join
kpo_data ON kpo_data.id = files.id
left join
pd_markings ON pd_markings.id = files.id
where
files.filename in (X, Y, Z, ........)
group by files.filename;
Now all the values become multiples of each other. Any ideas?
Note that the first two columns return a value of 1 or 0. That's actually the desired result, as the one-to-one tables will only have either 1 or 0 rows against any files.id, so if I don't use 'Distinct' the resulting value is wrong (I guess because of the other tables, which return more than one row against the same files.id). No, unfortunately, my tables don't have their own unique ID columns, except the 'files' table.
You need to flatten the results of your query in order to obtain a correct count.
You said you have a one-to-many relationship from your files table to other table(s).
If only SQL had a LOOKUP keyword, instead of cramming everything into JOIN, it would be easy to tell whether the relation between table A and table B is one-to-one; using JOIN automatically connotes one-to-many. I digress. Anyway, I should have already inferred that your files is one-to-many against dm_data, and that files against kc_data is one-to-many too. LEFT JOIN is another hint that the relationship between the first table and the second is one-to-many; this is not definitive though, as some coders just write everything with LEFT JOIN. There's nothing wrong with the LEFT JOIN in your query, but if there are multiple one-to-many tables in the query, it will surely fail: the rows of one table get repeated against the rows of the others.
from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id
So, knowing that files is one-to-many against dm_data and also one-to-many against kc_data, we can conclude that something is wrong with chaining those joins and grouping them in one monolithic query.
For example, suppose you have three tables, namely app (files), ios_app (dm_data) and android_app (kc_data), and this is the data for ios:
test=# select * from ios_app order by app_code, date_released;
ios_app_id | app_code | date_released | price
------------+----------+---------------+--------
1 | AB | 2010-01-01 | 1.0000
3 | AB | 2010-01-03 | 3.0000
4 | AB | 2010-01-04 | 4.0000
2 | TR | 2010-01-02 | 2.0000
5 | TR | 2010-01-05 | 5.0000
(5 rows)
And this is the data for your android:
test=# select * from android_app order by app_code, date_released;
android_app_id | app_code | date_released | price
----------------+----------+---------------+---------
1 | AB | 2010-01-06 | 6.0000
2 | AB | 2010-01-07 | 7.0000
7 | MK | 2010-01-07 | 7.0000
3 | TR | 2010-01-08 | 8.0000
4 | TR | 2010-01-09 | 9.0000
5 | TR | 2010-01-10 | 10.0000
6 | TR | 2010-01-11 | 11.0000
(7 rows)
If you merely use this query:
select x.app_code,
count(i.date_released) as ios_release_count,
count(a.date_released) as android_release_count
from app x
left join ios_app i on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
group by x.app_code
order by x.app_code
The output will be wrong:
app_code | ios_release_count | android_release_count
----------+-------------------+-----------------------
AB | 6 | 6
MK | 0 | 1
PM | 0 | 0
TR | 8 | 8
(4 rows)
You can think of chained joins as a Cartesian product: if there are 3 matching rows in the first table and 2 in the second, the output has 6 rows.
Here's the visualization. See that there are 2 repeating android AB rows for every ios AB row. There are 3 ios AB rows, so what would the count be when you do COUNT(ios_app.date_released)? It becomes 6; the same goes for COUNT(android_app.date_released), which will also be 6. Likewise, there are 4 repeating android TR rows for every ios TR row, and there are 2 TR rows in ios, so that gives us a count of 8.
app_code | ios_release_date | android_release_date
----------+------------------+----------------------
AB | 2010-01-01 | 2010-01-06
AB | 2010-01-01 | 2010-01-07
AB | 2010-01-03 | 2010-01-06
AB | 2010-01-03 | 2010-01-07
AB | 2010-01-04 | 2010-01-06
AB | 2010-01-04 | 2010-01-07
MK | | 2010-01-07
PM | |
TR | 2010-01-02 | 2010-01-08
TR | 2010-01-02 | 2010-01-09
TR | 2010-01-02 | 2010-01-10
TR | 2010-01-02 | 2010-01-11
TR | 2010-01-05 | 2010-01-08
TR | 2010-01-05 | 2010-01-09
TR | 2010-01-05 | 2010-01-10
TR | 2010-01-05 | 2010-01-11
(16 rows)
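The multiplication can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module rather than the database used above; the ios_app and android_app rows are copied from the listings, while the contents of the app table (AB, MK, PM, TR) are an assumption inferred from the query output.

```python
import sqlite3

# In-memory database; sqlite3 is used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table app (app_code text primary key);
    create table ios_app (ios_app_id integer, app_code text,
                          date_released text, price real);
    create table android_app (android_app_id integer, app_code text,
                              date_released text, price real);
    -- app rows are assumed from the app_code values seen in the output
    insert into app values ('AB'), ('MK'), ('PM'), ('TR');
    insert into ios_app values
        (1, 'AB', '2010-01-01', 1), (3, 'AB', '2010-01-03', 3),
        (4, 'AB', '2010-01-04', 4), (2, 'TR', '2010-01-02', 2),
        (5, 'TR', '2010-01-05', 5);
    insert into android_app values
        (1, 'AB', '2010-01-06', 6), (2, 'AB', '2010-01-07', 7),
        (7, 'MK', '2010-01-07', 7), (3, 'TR', '2010-01-08', 8),
        (4, 'TR', '2010-01-09', 9), (5, 'TR', '2010-01-10', 10),
        (6, 'TR', '2010-01-11', 11);
""")

# Chained joins: every ios row is paired with every android row
# that shares the same app_code, so both counts get inflated.
naive = conn.execute("""
    select x.app_code,
           count(i.date_released) as ios_release_count,
           count(a.date_released) as android_release_count
    from app x
    left join ios_app i on i.app_code = x.app_code
    left join android_app a on a.app_code = x.app_code
    group by x.app_code
    order by x.app_code
""").fetchall()

# Number of joined rows before grouping (the table shown above).
joined_rows = conn.execute("""
    select count(*)
    from app x
    left join ios_app i on i.app_code = x.app_code
    left join android_app a on a.app_code = x.app_code
""").fetchone()[0]

print(naive)
print(joined_rows)
```

For AB the counts are 3 x 2 = 6 and for TR they are 2 x 4 = 8, matching the wrong output above.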
So what you should do is flatten each result before you join them to other tables and queries.
If your database supports CTEs, please use them. They are very neat and very self-documenting:
with ios_app_release_count_list as
(
select app_code, count(date_released) as ios_release_count
from ios_app
group by app_code
)
,android_release_count_list as
(
select app_code, count(date_released) as android_release_count
from android_app
group by app_code
)
select
x.app_code,
coalesce(i.ios_release_count,0) as ios_release_count,
coalesce(a.android_release_count,0) as android_release_count
from app x
left join ios_app_release_count_list i on i.app_code = x.app_code
left join android_release_count_list a on a.app_code = x.app_code
order by x.app_code;
Whereas if your database has no CTE capability, like MySQL before version 8.0, you should do this instead:
select x.app_code,
coalesce(i.ios_release_count,0) as ios_release_count,
coalesce(a.android_release_count,0) as android_release_count
from app x
left join
(
select app_code, count(date_released) as ios_release_count
from ios_app
group by app_code
) i on i.app_code = x.app_code
left join
(
select app_code, count(date_released) as android_release_count
from android_app
group by app_code
) a on a.app_code = x.app_code
order by x.app_code
That query and the CTE-style query will show the correct output:
app_code | ios_release_count | android_release_count
----------+-------------------+-----------------------
AB | 3 | 2
MK | 0 | 1
PM | 0 | 0
TR | 2 | 4
(4 rows)
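The same flattening idea should carry over to the tables in the question. Below is a hedged sketch, again using Python's sqlite3 module for illustration: the column layout of files, kc_data and pd_markings is taken from the question's query, but the sample rows and the pd_walking alias are made up, and it assumes filenames are unique so no outer GROUP BY is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table files (id integer primary key, filename text);
    create table kc_data (id integer, type text);
    create table pd_markings (id integer, movement text);
    -- hypothetical sample rows, not data from the question
    insert into files values (1, 'X'), (2, 'Y');
    insert into kc_data values
        (1, 'bicycle'), (1, 'bus'), (1, 'bus'), (2, 'bicycle');
    insert into pd_markings values
        (1, 'walking'), (1, 'walking'), (1, 'standing');
""")

# Each one-to-many table is aggregated down to one row per id first,
# so joining it to files can no longer multiply the other counts.
flattened = conn.execute("""
    select f.filename,
           coalesce(kc.bicycle, 0) as bicycle,
           coalesce(kc.bus, 0)     as bus,
           coalesce(pd.walking, 0) as pd_walking
    from files f
    left join (
        select id,
               count(case when type like '%bicycle%' then 1 end) as bicycle,
               count(case when type like '%bus%' then 1 end)     as bus
        from kc_data
        group by id
    ) kc on kc.id = f.id
    left join (
        select id,
               count(case when movement like '%walking%' then 1 end) as walking
        from pd_markings
        group by id
    ) pd on pd.id = f.id
    order by f.filename
""").fetchall()

print(flattened)
```

The outer coalesce is still needed here because a file with no rows at all in a child table gets NULL from the left join, not 0. kpo_data would be handled with a third subquery of the same shape.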
Live test
Incorrect query: http://www.sqlfiddle.com/#!2/9774a/2
Correct query: http://www.sqlfiddle.com/#!2/9774a/1