The problem is to return the rows which contain nulls as well. Below is SQL code to create table and populate it with sample data.
I'm expecting below, but query does not show the two rows with null values.
src_t1 id1_t1 id2_t1 val_t1 src_t2 id1_t2 id2_t2 val_t2 b z z 4 a w w 100 b w w 1 a x x 200 b x x 2 a y y 300
Data:
CREATE TABLE sample (
src VARCHAR(6)
,id1 VARCHAR(6)
,id2 VARCHAR(6)
,val FLOAT
);
INSERT INTO sample (src, id1, id2, val)
VALUES ('a', 'w', 'w', 100)
,('b', 'w', 'w', 1)
,('a', 'x', 'x', 200)
,('b', 'x', 'x', 2)
,('a', 'y', 'y', 300)
,('b', 'z', 'z', 4)
;
This is my test query. It does not show results when t1.src = 'a' and t1.id1 = 'y' or when t2.src = 'b' and t2.id1 = 'z'.
Why?
What's the correct query?
SELECT t1.src, t1.id1, t1.id2, t1.val
,t2.src as src2, t2.id1, t2.id2, t2.val
FROM sample t1 FULL OUTER JOIN sample t2
ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
WHERE (t1.src = 'a' AND t2.src = 'b')
OR (t1.src IS NULL AND t1.id1 IS NULL AND t1.id2 IS NULL)
OR (t2.src IS NULL AND t2.id1 IS NULL AND t2.id2 IS NULL)
I've also tried moving the conditions in the WHERE
clause to the ON
clause as well.
TIA.
From what you are saying, the 2 tables you are comparing are more or less the same, and full outer join giving you records from both tables, chances are you are going to get a lot of duplicates. So, that's the logic behind it.
Select column values in a specific order within rows to make rows with duplicate sets of values identical. Then you can use SELECT DISTINCT to remove duplicates. Alternatively, retrieve rows in such a way that near-duplicates are not even selected.
Avoiding Duplicates Again, if we perform a left outer join where date = date, each row from Table 5 will join on to every matching row from Table 4. However, in this case, the join will result in 4 rows of duplicate dates in the joined DataSet (see Table 6).
Check for Duplicates in Multiple Tables With INNER JOINUse the INNER JOIN function to find duplicates that exist in multiple tables. Sample syntax for an INNER JOIN function looks like this: SELECT column_name FROM table1 INNER JOIN table2 ON table1. column_name = table2.
The WHERE clause evaluates too late, effectively converting your query into an inner join.
Instead, write your query like this using proper JOIN syntax:
SELECT t1.src, t1.id1, t1.id2, t1.val
,t2.src as src2, t2.id1, t2.id2, t2.val
FROM (
select * from sample
where src='a'
) t1 FULL OUTER JOIN (
select * from sample
where src='b'
) t2
ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
yielding this result set:
src id1 id2 val src2 id1 id2 val
---- ---- ---- ----------- ---- ---- ---- -----------
a w w 100 b w w 1
a x x 200 b x x 2
NULL NULL NULL NULL b z z 4
a y y 300 NULL NULL NULL NULL
Update:
Note also the use of two sub-queries to clearly separate the source table into two distinct relvars. I missed this for a minute on my first submission.
Actually, I think the solution is a bit cleaner if a CTE is used:
WITH A AS (
select * from sample where src='a'
),
B AS (
select * from sample where src='b'
)
SELECT *
FROM A FULL OUTER JOIN B
ON A.ID1 = B.ID1 AND A.ID2 = B.ID2
;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With