I hava data in an Oracle table that is organized as a graph that can contain cycles (see example).
CREATE TABLE T (parent INTEGER, child INTEGER)
AS select 1 parent, 2 child from dual
union all select 1 parent, 8 child from dual
union all select 2 parent, 3 child from dual
union all select 2 parent, 4 child from dual
union all select 2 parent, 8 child from dual
union all select 3 parent, 4 child from dual
union all select 3 parent, 6 child from dual
union all select 4 parent, 5 child from dual
union all select 5 parent, 8 child from dual
union all select 6 parent, 5 child from dual
union all select 7 parent, 3 child from dual
union all select 7 parent, 5 child from dual
union all select 8 parent, 6 child from dual
My goal is to get all nodes that are descendants (children, children of children, etc.) of node X. Let's say 2. My expected result is then: 3, 4, 5, 6, 8.
I know that I can design a query like this:
SELECT child, sys_connect_by_path(child,'/')
FROM T
START WITH parent = 2
CONNECT BY NOCYCLE PRIOR child = PARENT;
The problem with such a query is that it will go through all possible paths until they cycle, and there are way too many of them in my actual data. The result consists of many duplicates – Here it is:
child | sys_connect_by_path (for information)
3 | /3
4 | /3/4
5 | /3/4/5
8 | /3/4/5/8
6 | /3/4/5/8/6
6 | /3/6
5 | /3/6/5
8 | /3/6/5/8
4 | /4
5 | /4/5
8 | /4/5/8
6 | /4/5/8/6
8 | /8
6 | /8/6
5 | /8/6/5
My actual data is much more complex. the cost of execution of such a query is so huge that my TEMP tablespace, which was autoextendable, reached 10Gb (originally 500 Mb) and my database actually broke because of disk full.
I tried to design the query like this (recursive WITH clause) :
WITH descendants(node) AS
( SELECT 2 node FROM dual
UNION ALL
(
SELECT child
FROM T
INNER JOIN descendants D
ON T.parent = D.node
MINUS SELECT node FROM descendants
)
)
SELECT * FROM descendants
The problem that I encounter is:
ORA-32033: unsupported column aliasing
, and some customers use Oracle 9 or 10),ORA-32041: UNION ALL operation in recursive WITH clause must have only two branches
. If I remove the MINUS clause I will get cycles (ORA-32044: cycle detected while executing recursive WITH query
). How would you query my original data to get those nodes 3, 4, 5, 6, 8 efficiently? PL/SQL solutions are also welcome.
Thank you.
Oracle processes hierarchical queries as follows: A join, if present, is evaluated first, whether the join is specified in the FROM clause or with WHERE clause predicates. The CONNECT BY condition is evaluated. Any remaining WHERE clause predicates are evaluated.
A recursive subquery factoring clause must contain two query blocks combined by a UNION ALL set operator. The first block is known as the anchor member, which can not reference the query name. It can be made up of one or more query blocks combined by the UNION ALL , UNION , INTERSECT or MINUS set operators.
Purpose. SYS_CONNECT_BY_PATH is valid only in hierarchical queries. It returns the path of a column value from root to node, with column values separated by char for each row returned by CONNECT BY condition. Both column and char can be any of the datatypes CHAR , VARCHAR2 , NCHAR , or NVARCHAR2 .
Use hierarchyid as a data type to create tables with a hierarchical structure, or to describe the hierarchical structure of data that is stored in another location. Use the hierarchyid functions in Transact-SQL to query and manage hierarchical data.
What is your expected maximum depth to reach any child node?
If it's relatively small, you could loop down, while checking for nodes you have already visited, in a manner something like this...
(Note, I'm not an Oracle expert so this is closer to pseudo code with a little real SQL mixed in)
CREATE TABLE myMap (parent INT, child INT);
INSERT INTO myTable SELECT NULL, 2 FROM DUAL;
WHILE (SQL%ROWCOUNT > 0)
LOOP
INSERT INTO
myMap
SELECT DISTINCT
dataMap.parent,
dataMap.child
FROM
myMap
INNER JOIN
dataMap
ON myMap.child = dataMap.parent
WHERE
NOT EXISTS (SELECT * FROM myMap WHERE parent = dataMap.parent)
END LOOP;
Depending on performance, you may also want a depth
field in myMap
; optimising the join so as to only join on the most recent nodes. This would imply two indexes; one for the JOIN (depth)
and one for the NOT EXISTS (parent)
.
EDIT
Added the DISTINCT key word, to avoid the following case...
- Node 2 maps to 3 and 4
- Nodes 3 and 4 both map to node 5
- All children of node 5 would now be processed twice
GROUP BY, or many other options, can be used to cater for this instead of DISTINCT. It's just that the NOT EXISTS on it's own is not sufficient.
I have not worked with this myself, but what about a CONNECT BY with the NOCYCLE option? That should stop travering the tree when it sees a loop. Oracle 11i definitely has that, I think it came in somewhere in the Oracle 10g period.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With