Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Alternatives to full outer join for logical OR in tree structure query

I hope the title is clear enough. I've been implementing logical AND/OR for tree structures which are kept in the database, using a simple nodes and a parent-child association table. A sample tree has a structure like this:

nodes and their children

A sample tree structure query is as follows:

A contains children of certain types, which in turn contain other nodes

The double lines in the query pattern mean that A has a child of type B (somewhere down its child nodes) OR C. I have implemented A -> HASCHILD -> C -> HASCHILD -> E with an inner join, and A -> HASCHILD -> B -> HASCHILD -> E is implemented like this. The trick is joining these two branches on A. Since this is an OR operation, either B branch or C branch may not exist. The only method I could think of if to use full outer joins of two branches with A's node_id as the key. To avoid details, let me give this simplified snippet from my SQL query:

    WITH A as (....),
    B as (....),
    C as (....),
    ......

SELECT * 
      from
          A
          INNER JOIN A_CONTAINS_B ON  A.NODE_ID = A_CONTAINS_B.parent
          INNER JOIN B ON A_CONTAINS_B.children @> ARRAY[B.NODE_ID]
          INNER JOIN .....

          full OUTER JOIN -- THIS IS WHERE TWO As ARE JOINED
          (select
          A2.NODE_ID AS A2_NODE_ID
          from
          A2
          INNER JOIN A_CONTAINS_C ON A2.NODE_ID = C_CONTAINS_C.parent
          INNER JOIN C ON A_CONTAINS_C.children @> ARRAY[C.NODE_ID]
          INNER JOIN ....) 
          as other_branch
          ON other_branch.A2_NODE_ID = A.NODE_ID

This query links two As which actually represent the same A using node_id, and if B or C does not exist, nothing breaks. The result set has duplicates of course, but I can live with that. I can't however think of another way to implement OR in this context. ANDs are easy, they are inner joins, but left outer join is the only approach that lets me connect As. UNION ALL with dummy columns for both branches is not an option because I can't connect As in that case.

Do you have any alternatives to what I'm doing here?

UPDATE

TokenMacGuy's suggestion gives me a cleaner route than what I have at the moment. I should have remembered UNION. Using the first approach he has suggested, I can apply a query pattern decomposition, which would be a consistent way of breaking down queries with logical operators. The following is a visual representation of what I'm going to do, just in case it helps someone else visualize the process:

Tree query decomposition

This helps me do a lot of nice things, including creating a nice result set where query pattern components are linked to results. I've deliberately avoided details of tables or other context, because my question is about how to join results of queries. How I handle the hierarchy in DB is a different topic which I'd like to avoid. I'll add more details into comments. This is basically an EAV table accomponied by a hierarchy table. Just in case someone would like to see it, here is the query I'm running without any simplifications, after following TokenMacGuy's suggestion:

WITH
    COMPOSITION1 as (select comp1.* from temp_eav_table_global as comp1
                      WHERE
                      comp1.actualrmtypename = 'COMPOSITION'),
    composition_contains_observation as (select * from parent_child_arr_based),
    OBSERVATION as (select obs.* from temp_eav_table_global as obs
                        WHERE
                        obs.actualrmtypename = 'OBSERVATION'),
    observation_cnt_element as (select * from parent_child_arr_based),
    OBS_ELM as (select obs_elm.* from temp_eav_table_global as obs_elm
                        WHERE
                        obs_elm.actualrmtypename= 'ELEMENT'),

     COMPOSITION2 as (select comp_node_tbl2.* from temp_eav_table_global as comp_node_tbl2
                          where
                          comp_node_tbl2.actualrmtypename = 'COMPOSITION'),
    composition_contains_evaluation as (select * from parent_child_arr_based),
    EVALUATION as (select eva_node_tbl.* from temp_eav_table_global as eva_node_tbl
                            where
                            eva_node_tbl.actualrmtypename = 'EVALUATION'),
    eval_contains_element as (select * from parent_child_arr_based),
    ELEMENT as (select el_node_tbl.* from temp_eav_table_global as el_node_tbl
                            where
                            el_node_tbl.actualrmtypename = 'ELEMENT')



select
                      'branch1' as branchid,
                      COMPOSITION1.featuremappingid as comprootid,
                      OBSERVATION.featuremappingid as obs_ftid,
                      OBSERVATION.actualrmtypename as obs_tn,
                      null as ev_ftid,
                      null as ev_tn,
                      OBS_ELM.featuremappingid as obs_elm_fid,
                      OBS_ELm.actualrmtypename as obs_elm_tn,
                      null as ev_el_ftid,
                      null as ev_el_tn

                      from
                      COMPOSITION1
                      INNER JOIN composition_contains_observation ON COMPOSITION1.featuremappingid = composition_contains_observation.parent
                      INNER JOIN OBSERVATION ON composition_contains_observation.children @> ARRAY[OBSERVATION.featuremappingid]
                      INNER JOIN observation_cnt_element on observation_cnt_element.parent = OBSERVATION.featuremappingid
                      INNER JOIN OBS_ELM ON observation_cnt_element.children @> ARRAY[obs_elm.featuremappingid]

UNION

SELECT                  
                        'branch2' as branchid,
                        COMPOSITION2.featuremappingid as comprootid,
                        null as obs_ftid,
                        null as obs_tn,
                        EVALUATION.featuremappingid as ev_ftid,
                        EVALUATION.actualrmtypename as ev_tn,
                        null as obs_elm_fid,
                        null as obs_elm_tn,
                        ELEMENT.featuremappingid as ev_el_ftid,
                        ELEMENT.actualrmtypename as ev_el_tn                       
                  from
                      COMPOSITION2
                      INNER JOIN composition_contains_evaluation ON  COMPOSITION2.featuremappingid = composition_contains_evaluation.parent
                      INNER JOIN EVALUATION ON composition_contains_evaluation.children @> ARRAY[EVALUATION.featuremappingid]
                      INNER JOIN eval_contains_element ON EVALUATION.featuremappingid = eval_contains_element.parent
                      INNER JOIN ELEMENT on eval_contains_element.children @> ARRAY[ELEMENT.featuremappingid]
like image 436
mahonya Avatar asked Oct 21 '12 21:10

mahonya


People also ask

What is alternative for full outer join?

There is also a script here comparing a FULL OUTER JOIN with an equivalent CROSS JOIN. If the columns you are joining/grouping on are foreign key references to indexed tables in your database, a CROSS JOIN also can be slightly more efficient (in addition to being shorter and clearer) than using a FULL OUTER JOIN.

What is the alternative of full join in SQL?

The sample data suggest that you could use a UNION instead of the FULL JOIN.

Which is faster union or full outer join?

Union will be faster, as it simply passes the first SELECT statement, and then parses the second SELECT statement and adds the results to the end of the output table.

Why cant I use full outer join in SQL?

You're getting that error because MySQL does not support (or recognize) the FULL OUTER JOIN syntax. However, it is possible emulate a FULL OUTER JOIN in MySQL. We actually need two queries. One query return all the rows from the table on the left.


1 Answers

the relational equivalent to ∨ is ⋃. You could either use union to combine a JOIN b JOIN e with a JOIN c JOIN e or just use the union of b and c and join on the resulting, combined relation, something like a JOIN (b UNION c) JOIN e

More completely:

SELECT *
FROM a
JOIN (
    SELECT
        'B' source_relation,
        parent,
        b.child,
        b_thing row_from_b,
        NULL row_from_c
    FROM a_contains_b JOIN b ON a_contains_b.child = b.node_id

    UNION
    SELECT
        'C',
        parent
        c.child,
        NULL,
        c_thing
    FROM a_contains_c JOIN c ON a_contains_c.child = c.node_id
) a_c ON A.NODE_ID = a_e.parent
JOIN e ON a_c.child = e.node_id;
like image 146
SingleNegationElimination Avatar answered Sep 29 '22 04:09

SingleNegationElimination