Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Getting counts/totals at each level of a hierarchical query using CONNECT BY

Tags:

sql

oracle

I am having a heck of a time with this. I am trying to write a query (using Oracle), against a table with a recursive relationship (hierarchical) and get the total number of records stored in another table at and below each node in the tree. The other table only has records associated with the leaf nodes. However, I want to get totals at and below each node in the tree. For example, say I have two tables. DIRS contains the directory names and a recursive relationship identifying the structure of the directories, and FILES contains file information with a foreign key to DIRS indicating the directory the file resides in:

DIRS
====
DIR_ID 
PARENT_DIR_ID
DIR_NAME

FILES
=====
FILE_ID
FILE_NAME
DIR_ID
FILE_SIZE

If DIRS contains:

DIR_ID   PARENT_DIR_ID   DIR_NAME
======   =============   ========
1                        ROOT
2        1               DIR1_1
3        1               DIR1_2
4        2               DIR2_1
5        2               DIR2_2

and FILES contains

FILE_ID   FILE_NAME   DIR_ID   FILE_SIZE
=======   =========   ======   =========
1         test1.txt   5        100
2         test2.txt   5        200
3         test5.txt   5         50 
4         test3.txt   3        300
5         test4.txt   3        300
6         test6.txt   4        100

I want a query that returns the path along with the number of files in or below each node in the hierarchy. Basically a rollup of the number of files. So the query result would look something like:

Path                    File_Count
=====                   ===========
/ROOT                   6
/ROOT/DIR1_1            4
/ROOT/DIR1_1/DIR2_1     1
/ROOT/DIR1_1/DIR2_2     3
/ROOT/DIR1_2            2

UPDATE SQL script to create the tables with example data to match the above:

create table DIRS (dir_id number(38) primary key
    , parent_dir_id number(38) null references DIRS(dir_id)
    , dir_name varchar2(128) not null);

create table FILES (file_id number(38) primary key
    , file_name varchar2(128) not null
    , dir_id number(38) not null references DIRS(dir_id)
    , file_size number not null
    , unique (dir_id, file_name));

insert into DIRS 
select 1, null, 'ROOT' from dual
union all select 2, 1, 'DIR1_1' from dual 
union all select 3, 1, 'DIR1_2' from dual 
union all select 4, 2, 'DIR2_1' from dual 
union all select 5, 2, 'DIR2_2' from dual;

insert into files
select 1, 'test1.txt', 5, 100 from dual
union all select 2, 'test2.txt', 5, 200 from dual
union all select 3, 'test5.txt', 5, 50 from dual
union all select 4, 'test3.txt', 3, 300 from dual
union all select 5, 'test4.txt', 3, 300 from dual
union all select 6, 'test6.txt', 4, 100 from dual;

commit;
like image 602
GregH Avatar asked Sep 10 '12 22:09

GregH


People also ask

How fetch data is hierarchical in SQL?

Use hierarchyid as a data type to create tables with a hierarchical structure, or to describe the hierarchical structure of data that is stored in another location. Use the hierarchyid functions in Transact-SQL to query and manage hierarchical data.

What is the use of Connect by Level in Oracle?

The CONNECT BY clause defines the hierarchical relationship between the parent rows and the child rows of the hierarchy. DUAL is a dummy table automatically generated by Oracle database along with data dictionary. Explanation: Above query will execute Level having initial value 1 from dual, which is dummy table.

How do you write a hierarchical query?

START WITH specifies the root row(s) of the hierarchy. CONNECT BY specifies the relationship between parent rows and child rows of the hierarchy. The NOCYCLE parameter instructs Oracle Database to return rows from a query even if a CONNECT BY loop exists in the data.

How do you use Connect by?

The CONNECT BY clause should contain one or more expressions similar to those used in joins. Specifically, a column in the “current” level of the table should refer to a column in the “prior” (higher) level of the table. For example, in a manager/employee hierarchy, the clause might look similar to: ...


2 Answers

This one is pretty straightforward:

09:38:54 HR@vm_xe> l                                      
  1  select sys_connect_by_path(dp.dir_name, '/') path    
  2         ,(select count(file_id)                       
  3             from dirs dc                              
  4                  ,files f                             
  5            where f.dir_id(+) = dc.dir_id              
  6          connect by prior dc.dir_id = dc.parent_dir_id
  7            start with dc.dir_id = dp.dir_id           
  8          ) count                                      
  9    from dirs dp                                       
 10    connect by prior dp.dir_id = dp.parent_dir_id      
 11*   start with dp.parent_dir_id is null                
09:38:55 HR@vm_xe> /                                      

PATH                                COUNT                 
------------------------------ ----------                 
/ROOT                                   6                 
/ROOT/DIR1_1                            4                 
/ROOT/DIR1_1/DIR2_1                     1                 
/ROOT/DIR1_1/DIR2_2                     3                 
/ROOT/DIR1_2                            2                 

5 rows selected.                                          

Elapsed: 00:00:00.02                                      
like image 136
Kirill Leontev Avatar answered Sep 28 '22 17:09

Kirill Leontev


select sys_connect_by_path(D.dir_name, '/'), S.count_distinct_file_id
from DIRS D
inner join (select subtree_root_dir_id
            , count(distinct file_id) count_distinct_file_id
        from (select distinct connect_by_root D.DIR_ID subtree_root_dir_id
                    , F.file_id 
                from DIRS D
                left outer join FILES F on F.dir_id = D.dir_id
                start with 1=1 connect by prior D.dir_id = D.parent_dir_id)
        group by subtree_root_dir_id) S
    on D.dir_id = S.subtree_root_dir_id
start with D.dir_id = 1 connect by prior D.dir_id = D.parent_dir_id

Gives the results you asked for, but my gut says I am not seeing something, and that the query can be much simpler. (Please do not accept this answer until a few days have passed, in hopes that some one submits a better answer.)

like image 40
Shannon Severance Avatar answered Sep 28 '22 16:09

Shannon Severance