
Recursive Insert using connect by clause

I have hierarchical data in a table (shown on the right) that forms the hierarchy shown on the left. The tables are stored in Oracle 11g.

TREE Hierarchy          Tree Table  
--------------          Element Parent
                        ------  ------
P0                      P0  
    P1                  P1      P0
        P11             P2      P0
            C111        P11     P1
            C112        P12     P1
        P12             P21     P2
            C121        P22     P2
            C122        C111    P11
    P2                  C112    P11
        P21             C121    P12
            C211        C122    P12
            C212        C211    P21
        P22             C212    P21
            C221        C221    P22
            C222        C222    P22

My data table has the following values. It contains a value for every leaf node, plus one parent (P11).
Data Table

Element Value  
C111    3  
C112    3  
C121    3  
C122    3  
C211    3  
C212    3  
C221    3  
C222    3  
P11     6  

I need to generate an insert statement, preferably a single one, that inserts rows into the data table based on the sum of the children's values. Note that the sum should be calculated only for those parents whose value is not already present in the data table.

Data Table (Expected After Insert)

Element Value
C111    3
C112    3
C121    3
C122    3
C211    3
C212    3
C221    3
C222    3
P11     6

-- Rows to insert
P12     6
P21     6
P22     6
P1      12
P2      12
P0      24
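
As a cross-check of the expected rows, here is a minimal Python sketch of the rule stated above. It is not part of the original question, and the names `PARENT`, `DATA` and `rows_to_insert` are made up for illustration. It assumes that a value already stored in the data table (such as P11) is used as-is rather than recomputed.

```python
# Tree and data tables from the question, as plain Python structures.
PARENT = {
    "P0": None, "P1": "P0", "P2": "P0",
    "P11": "P1", "P12": "P1", "P21": "P2", "P22": "P2",
    "C111": "P11", "C112": "P11", "C121": "P12", "C122": "P12",
    "C211": "P21", "C212": "P21", "C221": "P22", "C222": "P22",
}
DATA = {"C111": 3, "C112": 3, "C121": 3, "C122": 3,
        "C211": 3, "C212": 3, "C221": 3, "C222": 3, "P11": 6}


def rows_to_insert(parent, data):
    """Return {element: value} for every element missing from `data`."""
    children = {}
    for element, par in parent.items():
        children.setdefault(par, []).append(element)

    def value_of(element):
        # Assumption: a value already stored in the data table is used as-is.
        if element in data:
            return data[element]
        # Otherwise sum the children (a leaf with no stored value yields 0).
        return sum(value_of(child) for child in children.get(element, []))

    return {e: value_of(e) for e in parent if e not in data}


missing = rows_to_insert(PARENT, DATA)
print(missing)
```

With the sample data this should produce the six rows listed under "Rows to insert".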
asked Nov 22 '11 10:11 by BigBoss


2 Answers

If all leaf nodes are at the same height (here lvl=4), you can write a simple CONNECT BY query with a ROLLUP:

SELECT lvl0,
       regexp_substr(path, '[^/]+', 1, 2) lvl1,
       regexp_substr(path, '[^/]+', 1, 3) lvl2,
       SUM(VALUE) sum_value
  FROM (SELECT sys_connect_by_path(t.element, '/') path,
               connect_by_root(t.element) lvl0,
               t.element, d.VALUE, LEVEL lvl
          FROM tree t
          LEFT JOIN DATA d ON d.element = t.element
         START WITH t.PARENT IS NULL
       CONNECT BY t.PARENT = PRIOR t.element)
 WHERE VALUE IS NOT NULL
   AND lvl = 4
 GROUP BY lvl0, ROLLUP(regexp_substr(path, '[^/]+', 1, 2),
                       regexp_substr(path, '[^/]+', 1, 3));

LVL0 LVL1  LVL2   SUM_VALUE
---- ----- ----- ----------
P0   P1    P11            6
P0   P1    P12            6
P0   P1                  12
P0   P2    P21            6
P0   P2    P22            6
P0   P2                  12
P0                       24

The insert would look like:

INSERT INTO data (element, value) 
(SELECT coalesce(lvl2, lvl1, lvl0), sum_value
   FROM <query> d_out
  WHERE NOT EXISTS (SELECT NULL
                      FROM data d_in
                     WHERE d_in.element = coalesce(lvl2, lvl1, lvl0)));
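
To make the roles of ROLLUP, COALESCE and NOT EXISTS concrete, here is a rough Python emulation of the statement above. It is only a sketch and not something the database runs. The names `LEAVES`, `EXISTING`, `rollup_subtotals` and `insert_rows` are hypothetical, and it assumes every leaf sits at level 4, as in the sample data.

```python
from collections import defaultdict

# Leaf paths and values, as the inner CONNECT BY query returns them at lvl = 4.
LEAVES = {
    "/P0/P1/P11/C111": 3, "/P0/P1/P11/C112": 3,
    "/P0/P1/P12/C121": 3, "/P0/P1/P12/C122": 3,
    "/P0/P2/P21/C211": 3, "/P0/P2/P21/C212": 3,
    "/P0/P2/P22/C221": 3, "/P0/P2/P22/C222": 3,
}
# Elements already present in the data table: all leaves plus P11.
EXISTING = {path.rsplit("/", 1)[1] for path in LEAVES} | {"P11"}


def rollup_subtotals(leaves):
    """Sum leaf values per grouping (lvl0), (lvl0, lvl1), (lvl0, lvl1, lvl2)."""
    totals = defaultdict(int)
    for path, value in leaves.items():
        lvl0, lvl1, lvl2, _leaf = path.strip("/").split("/")
        for key in ((lvl0,), (lvl0, lvl1), (lvl0, lvl1, lvl2)):
            totals[key] += value
    return dict(totals)


def insert_rows(totals, existing):
    # key[-1] plays the role of COALESCE(lvl2, lvl1, lvl0): the deepest
    # non-null level of the subtotal row. The membership test plays the
    # role of the NOT EXISTS filter.
    return {key[-1]: total for key, total in totals.items()
            if key[-1] not in existing}


print(insert_rows(rollup_subtotals(LEAVES), EXISTING))
```

The P11 subtotal is computed by the roll-up but dropped by the filter, because P11 already has a row in the data table.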

If the height of the leaf nodes is unknown or unbounded, this gets more complicated. The above approach won't work, because ROLLUP needs a fixed list of columns, so the number of levels has to be known in advance.

In that case, you could use the tree structure in a self-join:

WITH HIERARCHY AS (
   SELECT t.element, path, VALUE
     FROM (SELECT sys_connect_by_path(t.element, '/') path,
                  connect_by_isleaf is_leaf, ELEMENT
             FROM tree t
            START WITH t.PARENT IS NULL
          CONNECT BY t.PARENT = PRIOR t.element) t
     LEFT JOIN DATA d ON d.element = t.element
                     AND t.is_leaf = 1
)
SELECT h.element, SUM(elements.value)
  FROM HIERARCHY h
  JOIN HIERARCHY elements ON elements.path LIKE h.path||'/%'
 WHERE h.VALUE IS NULL
 GROUP BY h.element
 ORDER BY 1;

ELEMENT SUM(ELEMENTS.VALUE)
------- -------------------
P0                       24
P1                       12
P11                       6
P12                       6
P2                       12
P21                       6
P22                       6
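
The self-join boils down to a path-prefix match: each element picks up every row whose path starts with its own path plus a slash. Below is a hedged Python sketch of that idea, with made-up names (`path_of`, `descendant_leaf_sums`) and the question's sample data. Like the query, it only takes values from leaf rows, so P11 shows up in the result and still has to be filtered out before inserting.

```python
PARENT = {
    "P0": None, "P1": "P0", "P2": "P0",
    "P11": "P1", "P12": "P1", "P21": "P2", "P22": "P2",
    "C111": "P11", "C112": "P11", "C121": "P12", "C122": "P12",
    "C211": "P21", "C212": "P21", "C221": "P22", "C222": "P22",
}
DATA = {"C111": 3, "C112": 3, "C121": 3, "C122": 3,
        "C211": 3, "C212": 3, "C221": 3, "C222": 3, "P11": 6}


def path_of(element, parent):
    """Build '/P0/P1/...' for an element, like SYS_CONNECT_BY_PATH."""
    parts = []
    while element is not None:
        parts.append(element)
        element = parent[element]
    return "/" + "/".join(reversed(parts))


def descendant_leaf_sums(parent, data):
    has_children = set(parent.values())
    paths = {e: path_of(e, parent) for e in parent}
    # Mirrors LEFT JOIN ... AND is_leaf = 1: only leaf rows carry a value.
    value = {e: (None if e in has_children else data.get(e)) for e in parent}
    sums = {}
    for h in parent:
        if value[h] is not None:          # WHERE h.value IS NULL
            continue
        prefix = paths[h] + "/"           # elements.path LIKE h.path || '/%'
        matched = [value[e] for e in parent if paths[e].startswith(prefix)]
        if matched:                       # the inner join drops unmatched rows
            # Unlike SQL SUM, this yields 0 (not NULL) if every match is None.
            sums[h] = sum(v for v in matched if v is not None)
    return sums


sums = descendant_leaf_sums(PARENT, DATA)
to_insert = {e: v for e, v in sums.items() if e not in DATA}
print(sums)
print(to_insert)
```

Here `to_insert` applies the same "not already in the data table" check that the NOT EXISTS filter performs in the insert statement of the first approach.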
answered Oct 09 '22 17:10 by Vincent Malgrat


Here is another option using the SQL MODEL clause. I've taken some hints from Vincent's answer (the use of regexp_substr) to simplify my code.

The first part, inside the WITH clause, just reshapes the data and extracts the hierarchy element at each level.

The MODEL clause, at the end of the query, rolls the values up from the lowest level. More level columns will have to be added if there are more than four levels, but the query should work no matter at what level the values are held.

I'm not entirely sure that this will work in all circumstances, since I'm not that experienced with the MODEL clause, but it does at least seem to work in this case.

with my_hierarchy_data as (
select 
    element,
    value, 
    path, 
    parent,
    lvl0,
    regexp_substr(path, '[^/]+', 1, 2) as lvl1,
    regexp_substr(path, '[^/]+', 1, 3) as lvl2,
    regexp_substr(path, '[^/]+', 1, 4) as lvl3
from ( 
  select 
    element,
    value, 
    parent,
    sys_connect_by_path(element, '/') as path, 
    connect_by_root element as lvl0
  from 
    tree
    left outer join data using (element)
  start with parent is null
  connect by prior element = parent
  order siblings by element
  )
)
select 
    element,
    value, 
    path, 
    parent,
    new_value,
    lvl0, 
    lvl1, 
    lvl2, 
    lvl3
from my_hierarchy_data
model
return all rows
partition by (lvl0)
dimension by (lvl1, lvl2, lvl3)
measures(element, parent, value, value as new_value, path)
rules sequential order (
    new_value[lvl1, lvl2, null] = sum(value)[cv(lvl1), cv(lvl2), lvl3 is not null],
    new_value[lvl1, null, null] = sum(new_value)[cv(lvl1), lvl2 is not null, null],
    new_value[null, null, null] = sum(new_value)[lvl1 is not null, null, null]
)

The insert statement you can use is

INSERT INTO data (element, value)
select element, new_value
from <the_query>
where value is null;
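
For readers unfamiliar with MODEL, the sequential rules amount to a level-by-level roll-up: the deepest parents are summed from the leaves first, then each higher level is summed from the `new_value` of the level below. The Python sketch below mimics that order as I read the rules. It is an illustration only, with hypothetical names (`depth_of`, `roll_up_by_level`), and it assumes every leaf has a stored value.

```python
from collections import defaultdict

PARENT = {
    "P0": None, "P1": "P0", "P2": "P0",
    "P11": "P1", "P12": "P1", "P21": "P2", "P22": "P2",
    "C111": "P11", "C112": "P11", "C121": "P12", "C122": "P12",
    "C211": "P21", "C212": "P21", "C221": "P22", "C222": "P22",
}
DATA = {"C111": 3, "C112": 3, "C121": 3, "C122": 3,
        "C211": 3, "C212": 3, "C221": 3, "C222": 3, "P11": 6}


def depth_of(element, parent):
    """Number of steps from the element up to the root."""
    depth = 0
    while parent[element] is not None:
        element = parent[element]
        depth += 1
    return depth


def roll_up_by_level(parent, data):
    new_value = dict(data)                 # like "value as new_value"
    children = defaultdict(list)
    by_depth = defaultdict(list)
    for element, par in parent.items():
        if par is not None:
            children[par].append(element)
        by_depth[depth_of(element, parent)].append(element)
    # Skip the deepest level (the leaves), then work upwards one level at a
    # time, as the rules do under SEQUENTIAL ORDER.
    for depth in sorted(by_depth, reverse=True)[1:]:
        for element in by_depth[depth]:
            if children[element]:
                new_value[element] = sum(new_value[c] for c in children[element])
    return new_value


new_value = roll_up_by_level(PARENT, DATA)
# Same filter as the insert statement: keep rows whose original value is null.
to_insert = {e: v for e, v in new_value.items() if e not in DATA}
print(to_insert)
```

Note that P11 is recomputed from its children here as well; it is the final filter on the original value that keeps it out of the insert.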
answered Oct 09 '22 15:10 by Mike Meyers