I am having a heck of a time with this. I am trying to write a query (using Oracle), against a table with a recursive relationship (hierarchical) and get the total number of records stored in another table at and below each node in the tree. The other table only has records associated with the leaf nodes. However, I want to get totals at and below each node in the tree. For example, say I have two tables. DIRS contains the directory names and a recursive relationship identifying the structure of the directories, and FILES contains file information with a foreign key to DIRS indicating the directory the file resides in:
DIRS
====
DIR_ID
PARENT_DIR_ID
DIR_NAME
FILES
=====
FILE_ID
FILE_NAME
DIR_ID
FILE_SIZE
If DIRS contains:
DIR_ID PARENT_DIR_ID DIR_NAME
====== ============= ========
1 ROOT
2 1 DIR1_1
3 1 DIR1_2
4 2 DIR2_1
5 2 DIR2_2
and FILES contains
FILE_ID FILE_NAME DIR_ID FILE_SIZE
======= ========= ====== =========
1 test1.txt 5 100
2 test2.txt 5 200
3 test5.txt 5 50
4 test3.txt 3 300
5 test4.txt 3 300
6 test6.txt 4 100
I want a query that returns the path along with the number of files in or below each node in the hierarchy. Basically a rollup of the number of files. So the query result would look something like:
Path File_Count
===== ===========
/ROOT 6
/ROOT/DIR1_1 4
/ROOT/DIR1_1/DIR2_1 1
/ROOT/DIR1_1/DIR2_2 3
/ROOT/DIR1_2 2
UPDATE SQL script to create the tables with example data to match the above:
create table DIRS (dir_id number(38) primary key
, parent_dir_id number(38) null references DIRS(dir_id)
, dir_name varchar2(128) not null);
create table FILES (file_id number(38) primary key
, file_name varchar2(128) not null
, dir_id number(38) not null references DIRS(dir_id)
, file_size number not null
, unique (dir_id, file_name));
insert into DIRS
select 1, null, 'ROOT' from dual
union all select 2, 1, 'DIR1_1' from dual
union all select 3, 1, 'DIR1_2' from dual
union all select 4, 2, 'DIR2_1' from dual
union all select 5, 2, 'DIR2_2' from dual;
insert into files
select 1, 'test1.txt', 5, 100 from dual
union all select 2, 'test2.txt', 5, 200 from dual
union all select 3, 'test5.txt', 5, 50 from dual
union all select 4, 'test3.txt', 3, 300 from dual
union all select 5, 'test4.txt', 3, 300 from dual
union all select 6, 'test6.txt', 4, 100 from dual;
commit;
Use hierarchyid as a data type to create tables with a hierarchical structure, or to describe the hierarchical structure of data that is stored in another location. Use the hierarchyid functions in Transact-SQL to query and manage hierarchical data.
The CONNECT BY clause defines the hierarchical relationship between the parent rows and the child rows of the hierarchy. DUAL is a dummy table automatically generated by Oracle database along with data dictionary. Explanation: Above query will execute Level having initial value 1 from dual, which is dummy table.
START WITH specifies the root row(s) of the hierarchy. CONNECT BY specifies the relationship between parent rows and child rows of the hierarchy. The NOCYCLE parameter instructs Oracle Database to return rows from a query even if a CONNECT BY loop exists in the data.
The CONNECT BY clause should contain one or more expressions similar to those used in joins. Specifically, a column in the “current” level of the table should refer to a column in the “prior” (higher) level of the table. For example, in a manager/employee hierarchy, the clause might look similar to: ...
This one is pretty straightforward:
09:38:54 HR@vm_xe> l
1 select sys_connect_by_path(dp.dir_name, '/') path
2 ,(select count(file_id)
3 from dirs dc
4 ,files f
5 where f.dir_id(+) = dc.dir_id
6 connect by prior dc.dir_id = dc.parent_dir_id
7 start with dc.dir_id = dp.dir_id
8 ) count
9 from dirs dp
10 connect by prior dp.dir_id = dp.parent_dir_id
11* start with dp.parent_dir_id is null
09:38:55 HR@vm_xe> /
PATH COUNT
------------------------------ ----------
/ROOT 6
/ROOT/DIR1_1 4
/ROOT/DIR1_1/DIR2_1 1
/ROOT/DIR1_1/DIR2_2 3
/ROOT/DIR1_2 2
5 rows selected.
Elapsed: 00:00:00.02
select sys_connect_by_path(D.dir_name, '/'), S.count_distinct_file_id
from DIRS D
inner join (select subtree_root_dir_id
, count(distinct file_id) count_distinct_file_id
from (select distinct connect_by_root D.DIR_ID subtree_root_dir_id
, F.file_id
from DIRS D
left outer join FILES F on F.dir_id = D.dir_id
start with 1=1 connect by prior D.dir_id = D.parent_dir_id)
group by subtree_root_dir_id) S
on D.dir_id = S.subtree_root_dir_id
start with D.dir_id = 1 connect by prior D.dir_id = D.parent_dir_id
Gives the results you asked for, but my gut says I am not seeing something, and that the query can be much simpler. (Please do not accept this answer until a few days have passed, in hopes that some one submits a better answer.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With