I hope the title is clear enough. I've been implementing logical AND/OR for tree structures which are kept in the database, using a simple nodes and a parent-child association table. A sample tree has a structure like this:
A sample tree structure query is as follows:
The double lines in the query pattern mean that A has a child of type B (somewhere down its child nodes) OR C. I have implemented A -> HASCHILD -> C -> HASCHILD -> E with an inner join, and A -> HASCHILD -> B -> HASCHILD -> E is implemented like this. The trick is joining these two branches on A. Since this is an OR operation, either B branch or C branch may not exist. The only method I could think of if to use full outer joins of two branches with A's node_id as the key. To avoid details, let me give this simplified snippet from my SQL query:
WITH A as (....),
B as (....),
C as (....),
......
SELECT *
from
A
INNER JOIN A_CONTAINS_B ON A.NODE_ID = A_CONTAINS_B.parent
INNER JOIN B ON A_CONTAINS_B.children @> ARRAY[B.NODE_ID]
INNER JOIN .....
full OUTER JOIN -- THIS IS WHERE TWO As ARE JOINED
(select
A2.NODE_ID AS A2_NODE_ID
from
A2
INNER JOIN A_CONTAINS_C ON A2.NODE_ID = C_CONTAINS_C.parent
INNER JOIN C ON A_CONTAINS_C.children @> ARRAY[C.NODE_ID]
INNER JOIN ....)
as other_branch
ON other_branch.A2_NODE_ID = A.NODE_ID
This query links two As which actually represent the same A using node_id, and if B or C does not exist, nothing breaks. The result set has duplicates of course, but I can live with that. I can't however think of another way to implement OR in this context. ANDs are easy, they are inner joins, but left outer join is the only approach that lets me connect As. UNION ALL with dummy columns for both branches is not an option because I can't connect As in that case.
Do you have any alternatives to what I'm doing here?
UPDATE
TokenMacGuy's suggestion gives me a cleaner route than what I have at the moment. I should have remembered UNION. Using the first approach he has suggested, I can apply a query pattern decomposition, which would be a consistent way of breaking down queries with logical operators. The following is a visual representation of what I'm going to do, just in case it helps someone else visualize the process:
This helps me do a lot of nice things, including creating a nice result set where query pattern components are linked to results. I've deliberately avoided details of tables or other context, because my question is about how to join results of queries. How I handle the hierarchy in DB is a different topic which I'd like to avoid. I'll add more details into comments. This is basically an EAV table accomponied by a hierarchy table. Just in case someone would like to see it, here is the query I'm running without any simplifications, after following TokenMacGuy's suggestion:
WITH
COMPOSITION1 as (select comp1.* from temp_eav_table_global as comp1
WHERE
comp1.actualrmtypename = 'COMPOSITION'),
composition_contains_observation as (select * from parent_child_arr_based),
OBSERVATION as (select obs.* from temp_eav_table_global as obs
WHERE
obs.actualrmtypename = 'OBSERVATION'),
observation_cnt_element as (select * from parent_child_arr_based),
OBS_ELM as (select obs_elm.* from temp_eav_table_global as obs_elm
WHERE
obs_elm.actualrmtypename= 'ELEMENT'),
COMPOSITION2 as (select comp_node_tbl2.* from temp_eav_table_global as comp_node_tbl2
where
comp_node_tbl2.actualrmtypename = 'COMPOSITION'),
composition_contains_evaluation as (select * from parent_child_arr_based),
EVALUATION as (select eva_node_tbl.* from temp_eav_table_global as eva_node_tbl
where
eva_node_tbl.actualrmtypename = 'EVALUATION'),
eval_contains_element as (select * from parent_child_arr_based),
ELEMENT as (select el_node_tbl.* from temp_eav_table_global as el_node_tbl
where
el_node_tbl.actualrmtypename = 'ELEMENT')
select
'branch1' as branchid,
COMPOSITION1.featuremappingid as comprootid,
OBSERVATION.featuremappingid as obs_ftid,
OBSERVATION.actualrmtypename as obs_tn,
null as ev_ftid,
null as ev_tn,
OBS_ELM.featuremappingid as obs_elm_fid,
OBS_ELm.actualrmtypename as obs_elm_tn,
null as ev_el_ftid,
null as ev_el_tn
from
COMPOSITION1
INNER JOIN composition_contains_observation ON COMPOSITION1.featuremappingid = composition_contains_observation.parent
INNER JOIN OBSERVATION ON composition_contains_observation.children @> ARRAY[OBSERVATION.featuremappingid]
INNER JOIN observation_cnt_element on observation_cnt_element.parent = OBSERVATION.featuremappingid
INNER JOIN OBS_ELM ON observation_cnt_element.children @> ARRAY[obs_elm.featuremappingid]
UNION
SELECT
'branch2' as branchid,
COMPOSITION2.featuremappingid as comprootid,
null as obs_ftid,
null as obs_tn,
EVALUATION.featuremappingid as ev_ftid,
EVALUATION.actualrmtypename as ev_tn,
null as obs_elm_fid,
null as obs_elm_tn,
ELEMENT.featuremappingid as ev_el_ftid,
ELEMENT.actualrmtypename as ev_el_tn
from
COMPOSITION2
INNER JOIN composition_contains_evaluation ON COMPOSITION2.featuremappingid = composition_contains_evaluation.parent
INNER JOIN EVALUATION ON composition_contains_evaluation.children @> ARRAY[EVALUATION.featuremappingid]
INNER JOIN eval_contains_element ON EVALUATION.featuremappingid = eval_contains_element.parent
INNER JOIN ELEMENT on eval_contains_element.children @> ARRAY[ELEMENT.featuremappingid]
There is also a script here comparing a FULL OUTER JOIN with an equivalent CROSS JOIN. If the columns you are joining/grouping on are foreign key references to indexed tables in your database, a CROSS JOIN also can be slightly more efficient (in addition to being shorter and clearer) than using a FULL OUTER JOIN.
The sample data suggest that you could use a UNION instead of the FULL JOIN.
Union will be faster, as it simply passes the first SELECT statement, and then parses the second SELECT statement and adds the results to the end of the output table.
You're getting that error because MySQL does not support (or recognize) the FULL OUTER JOIN syntax. However, it is possible emulate a FULL OUTER JOIN in MySQL. We actually need two queries. One query return all the rows from the table on the left.
the relational equivalent to ∨ is ⋃. You could either use union
to combine a JOIN b JOIN e
with a JOIN c JOIN e
or just use the union of b and c and join on the resulting, combined relation, something like a JOIN (b UNION c) JOIN e
More completely:
SELECT *
FROM a
JOIN (
SELECT
'B' source_relation,
parent,
b.child,
b_thing row_from_b,
NULL row_from_c
FROM a_contains_b JOIN b ON a_contains_b.child = b.node_id
UNION
SELECT
'C',
parent
c.child,
NULL,
c_thing
FROM a_contains_c JOIN c ON a_contains_c.child = c.node_id
) a_c ON A.NODE_ID = a_e.parent
JOIN e ON a_c.child = e.node_id;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With