I have four tables:
╔═══╦════╗ ╔═══╦════╗ ╔═══╦════╗ ╔═══╦════╗
║ K ║ V1 ║ ║ K ║ V2 ║ ║ K ║ V3 ║ ║ K ║ V4 ║
╠═══╬════╣ ╠═══╬════╣ ╠═══╬════╣ ╠═══╬════╣
║ A ║ 1 ║ ║ B ║ 2 ║ ║ B ║ 3 ║ ║ C ║ 6 ║
║ C ║ 4 ║ ║ D ║ 7 ║ ║ C ║ 5 ║ ║ D ║ 8 ║
╚═══╩════╝ ╚═══╩════╝ ╚═══╩════╝ ╚═══╩════╝
I need to join them by the column K and select V1, V2, V3 and V4:
╔═══╦══════╦══════╦══════╦══════╗
║ K ║ V1 ║ V2 ║ V3 ║ V4 ║
╠═══╬══════╬══════╬══════╬══════╣
║ A ║ 1 ║ NULL ║ NULL ║ NULL ║
║ B ║ NULL ║ 2 ║ 3 ║ NULL ║
║ C ║ 4 ║ NULL ║ 5 ║ 6 ║
║ D ║ NULL ║ 7 ║ NULL ║ 8 ║
╚═══╩══════╩══════╩══════╩══════╝
One way to solve it is this:
DECLARE @T1 TABLE (K VARCHAR(1), V1 INT);
DECLARE @T2 TABLE (K VARCHAR(1), V2 INT);
DECLARE @T3 TABLE (K VARCHAR(1), V3 INT);
DECLARE @T4 TABLE (K VARCHAR(1), V4 INT);
INSERT INTO @T1 VALUES ('A', 1), ('C', 4);
INSERT INTO @T2 VALUES ('B', 2), ('D', 7);
INSERT INTO @T3 VALUES ('B', 3), ('C', 5);
INSERT INTO @T4 VALUES ('C', 6), ('D', 8);
SELECT COALESCE(t1.K, t2.K, t3.K, t4.K) [K], V1, V2, V3, V4
FROM @T1 t1
FULL JOIN @T2 t2 ON t2.K = t1.K
FULL JOIN @T3 t3 ON t3.K = t1.K OR t3.K = t2.K
FULL JOIN @T4 t4 ON t4.K = t1.K OR t4.K = t2.K OR t4.K = t3.K
ORDER BY 1;
But this works well only for this simple example, if I have more Ks (join conditions) and Vs, the join conditions end up being a lot more complex.
How can this query be improved?
When two tables use the same column name(s), use table_name. column_name or table_alias. column_name format in SELECT clause to differentiate them in the result set. Use INNER JOIN whenever possible because OUTER JOIN uses a lot more system resources and is much more slower.
In this case the two tables are joined using the relationship table1.id = table2.id . It is possible to use multiple join statements together to join more than one table at the same time. To do that you add a second INNER JOIN statement and a second ON statement to indicate the third table and the second relationship.
SQL JOIN. A JOIN clause is used to combine rows from two or more tables, based on a related column between them. Notice that the "CustomerID" column in the "Orders" table refers to the "CustomerID" in the "Customers" table. The relationship between the two tables above is the "CustomerID" column.
There are several ways, but I think the following is the most popular:
select allk.k, t1.c1, t2.v2, t3.v3, t4.v4
from (select k from t1 union
select k from t2 union
select k from t3 union
select k from t4
) allk left outer join
t1
on allk.k = t1.k left outer join
t2
on allk.k = t2.k left outer join
t3
on allk.k = t3.k left outer join
t4
on allk.k = t4.k;
If you have a separate table of all the k
values, you can use that instead of the allk
subquery.
An easier way to write your full outer join
query is to use coalesce()
:
SELECT COALESCE(t1.K, t2.K, t3.K, t4.K) [K], V1, V2, V3, V4
FROM @T1 t1 FULL JOIN
@T2 t2
ON t2.K = t1.K FULL JOIN
@T3 t3
ON t3.K = coalesce(t1.K, t2.K) FULL JOIN
@T4 t4
ON t4.K = coalesce(t1.K, t2.K, t3.k);
And the third method is the union all
/group by
method:
select k, max(v1) as v1, max(v2) as v2, max(v3) as v3, max(v4) as v4
from (select k, v1, NULL as v2, NULL as v3, NULL as v4
from t1 union all
select k, NULL as v1, v2, NULL as v3, NULL as v4
from t2 union all
select k, NULL as v1, NULL as v2, v3, NULL as v4
from t3 union all
select k, NULL as v1, NULL as v2, NULL as v3, v4
from t4
) t
group by k;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With