Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Joining tables if the reference exists

I got a PostgreSQL database with 4 tables:

Table A

---------------------------
| ID | B_ID | C_ID | D_ID |
---------------------------
| 1  |  1   | NULL | NULL |
---------------------------
| 2  | NULL |   1  | NULL |
---------------------------
| 3  |  2   |   2  |   1  |
---------------------------
| 4  | NULL | NULL |   2  |
---------------------------

Table B

-------------
| ID | DATA |
-------------
| 1  | 123  |
-------------
| 2  | 456  |
-------------

Table C

-------------
| ID | DATA |
-------------
| 1  | 789  |
-------------
| 2  | 102  |
-------------

Table D

-------------
| ID | DATA |
-------------
| 1  | 654  |
-------------
| 2  | 321  |
-------------

I'm trying to retrieve a result set which has joined the data from table B and the data from table C, only if one of booth IDs is not null.

SELECT "Table_A"."ID", "Table_A"."ID_B", "Table_A"."ID_C", "Table_A"."ID_D", "Table_B"."DATA", "Table_C"."DATA"
    FROM "Table_A"
        LEFT JOIN "Table_B" on "Table_A"."ID_B" = "Table_B"."ID"
        LEFT JOIN "Table_C" on "Table_A"."ID_C" = "Table_C"."ID"
    WHERE "Table_A"."ID_B" IS NOT NULL OR "Table_A"."ID_C" IS NOT NULL;

Is this recommended or should I better split this in multiple queries?

Is there a way to do an inner join between these tables?

The result I expect is:

-------------------------------------------------
| ID | ID_B | ID_C | ID_D | DATA (B) | DATA (C) |
-------------------------------------------------
| 1  |   1  | NULL | NULL |   123    |   NULL   |
-------------------------------------------------
| 2  | NULL |  1   | NULL |   NULL   |   789    |
-------------------------------------------------
| 3  |   2  |  2   | NULL |   456    |   102    |
-------------------------------------------------

EDIT: ID_B, ID_C, ID_D are foreign keys to the tables table_b, table_c, table_d

like image 771
wiizzard Avatar asked May 25 '13 09:05

wiizzard


People also ask

How do you join tables based on conditions?

You join two tables by creating a relationship in the WHERE clause between at least one column from one table and at least one column from another. The join creates a temporary composite table where each pair of rows (one from each table) that satisfies the join condition is linked to form a single row.

Can we apply joins on 2 tables which has no related data?

Yes, you can! The longer answer is yes, there are a few ways to combine two tables without a common column, including CROSS JOIN (Cartesian product) and UNION. The latter is technically not a join but can be handy for merging tables in SQL.

Can you inner join a table that doesn't have matching records?

The JOIN or INNER JOIN does not return any non-matching rows at all. It returns only the rows that match in both of the tables you join. If you want to get any unmatched rows, you shouldn't use it.


3 Answers

The WHERE "Table_A"."ID_B" IS NOT NULL OR "Table_A"."ID_C" IS NOT NULL; can be replaced by the corresponding clause on the B and C tables : WHERE "Table_B"."ID" IS NOT NULL OR "Table_C"."ID" IS NOT NULL; . This would also work if table_a.id_b and table_a.id_c are not FKs to the B and C tables. Otherwise, a table_a row with { 5, 5,5,5} would retrieve two NULL rows from the B and C tables.

SELECT ta."ID" AS a_id
        , ta."ID_B" AS b_id
        , ta."ID_C" AS c_id
        , ta."ID_D" AS d_id
        , tb."DATA" AS bdata
        , tc."DATA" AS cdata
FROM "Table_a" ta
LEFT JOIN "Table_B" tb on ta."ID_B" = tb."ID"
LEFT JOIN "Table_C" tc on ta."ID_C" = tc."ID"
WHERE tb."ID" IS NOT NULL OR tc."ID" IS NOT NULL
        ;
like image 83
wildplasser Avatar answered Sep 21 '22 22:09

wildplasser


Since you have foreign key constraints in place, referential integrity is guaranteed and the query in your Q is already the best answer.

Also indexes on Table_B.ID and Table_C.ID are given.

If matching cases in Table_A are rare (less than ~ 5 %, depending on row with and data distribution) a partial multi-column index would help performance:

CREATE INDEX table_a_special_idx ON "Table_A" ("ID_B", "ID_C")
WHERE "ID_B" IS NOT NULL OR "ID_C" IS NOT NULL;

In PostgreSQL 9.2 a covering index (index-only scan in Postgres parlance) might help even more - in which case you would include all columns of interest in the index (not in my example). Depends on several factors like row width and frequency of updates in your table.

like image 29
Erwin Brandstetter Avatar answered Sep 18 '22 22:09

Erwin Brandstetter


Given your requirements, your query seems good to me.

An alternative would be to use nested selects in the projection, but depending on your data, indexes and constraints, that might be slower, as nested selects usually result in nested loops, whereas joins can be performed as merge joins or nested loops:

SELECT 
    "Table_A"."ID", 
    "Table_A"."ID_B", 
    "Table_A"."ID_C", 
    "Table_A"."ID_D", 
    (SELECT "DATA" FROM "Table_B" WHERE "Table_A"."ID_B" = "Table_B"."ID"),
    (SELECT "DATA" FROM "Table_C" WHERE "Table_A"."ID_C" = "Table_C"."ID")
FROM "Table_A"
WHERE "Table_A"."ID_B" IS NOT NULL OR "Table_A"."ID_C" IS NOT NULL;

If Postgres does scalar subquery caching (as Oracle does), then nested selects might help in case you have a lot of data repetition in Table_A

like image 38
Lukas Eder Avatar answered Sep 19 '22 22:09

Lukas Eder