Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle hierarchical sum (distance from leaf to root)

Tags:

I would like to get help for a hierarchical query (Oracle 11gR2). I have a hard time with those kind of queries...

In fact, it's a 2 in 1 question (2 different approches needed).

I’m looking for a way to get the distance from all individials records to the root (not the opposite). My data are in a tree like structure:

CREATE TABLE MY_TREE
(ID_NAME VARCHAR2(1) PRIMARY KEY,
 PARENT_ID VARCHAR2(1),
 PARENT_DISTANCE NUMBER(2)
);

INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('A',NULL,NULL);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('B','A',1);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('C','B',3);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('D','B',5);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('E','C',7);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('F','D',11);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('G','D',13);

Hierarchically, my data look like this (but I have multiple independents roots and many more levels):

enter image description here

In the first approch, I'm looking for a query that will give me this result:

LEVEL ROOT NODE ID_NAME ROOT_DISTANCE
----- ---- ---- ------- -------------
1     A    null A       null
2     A    null B       1
3     A    B    C       4
4     A    B    E       11
3     A    B    D       6
4     A    D    F       17
4     A    D    G       19

In this result,

  • the "NODE" column mean the ID_NAME of the closest split element
  • the "ROOT_DISTANCE" column mean the distance from an element to the root (ex: the ROOT_DISTANCE for ID_NAME=G is the distance from G to A: G(13)+D(5)+B(1)=19)

In this approch, I will always specify a maximum of 2 roots.

The second approch must be a PL/SQL script that will do the same calculation (ROOT_DISTANCE), but in a iterative way, and that will write the result in a new table. I want to run this script one time, so all roots (~1000) will be processed.

Here's the way I see the script:

  • For all roots, we need to find associated leafs and then calculate the distance from the leaf to the root (for all element between the leaf and the root) and put this into a table.

This script is needed for "performance perspectives", so if an element have been already calculated (ex: a split node that was calculated by another leaf), we need to stop the calculation and pass to the next leaf because we already know the result from there to the root. For example, if the system calculates E-C-B-A, and then F-D-B-A, the B-A section should not be calcultated again because it was done in the first pass.

You can awnser one or both of those questions, but i will need the awnser to those two questions.

Thank you!

like image 549
JGLord Avatar asked Dec 08 '18 14:12

JGLord


1 Answers

Try this one:

WITH brumba(le_vel,root,node,id_name,root_distance) AS (
  SELECT 1 as le_vel, id_name as root, null as node, id_name, to_number(null) as root_distance  
  FROM MY_TREE WHERE parent_id IS NULL
  UNION ALL
  SELECT b.le_vel + 1, b.root, 
         CASE WHEN 1 < (
                SELECT count(*) FROM MY_TREE t1 WHERE t1.parent_id = t.parent_id
              )
              THEN t.parent_id ELSE b.node
         END,
         t.id_name, coalesce(b.root_distance,0)+t.parent_distance
  FROM MY_TREE t
  JOIN brumba b ON b.id_name = t.parent_id
)
SELECT * FROM brumba

Demo: https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=d5c231055e989c3cbcd763f4b3d3033f


There is no need for "the second approch" using PL/SQL - the above SQL will calculate results for all root nodes (which have null in parent_id column) at once.
Just add a prefix either INSERT INTO tablename(col1,col2, ... colN) ... or CREATE TABLE name AS ... to the above query.
The above demo contains an example for the latter option CREATE TABLE xxx AS query

like image 103
krokodilko Avatar answered Oct 07 '22 16:10

krokodilko