Description
In our problem domain we are working on a set of edges that connect together forming a graph. Starting from a given node (or nodes), we have to list all links within the entire graph that are connected to the given node (or nodes). We have to show these links from left-to-right, top-to-bottom.
We have a working query for this problem for graphs with a limited number of loops. Higher number of loops increases the execution time exponentially.
We need to limit visits to the same node during recursion to have a working solution.
The example below contains only a single loop, but this single loop is already causing 86 additional and obsolete rows.
In similar posts solutions are provided for postgresql using ROW and ANY operators, but I could not find an Oracle solution.
We are searching for either an alternative for our solution or a way to limit the number of visits to the same edges.
Any help is greatly appreciated!
Similar
Visiting a directed graph as if it were an undirected one, using a recursive query provides a solution in postgresql. We are required to use Oracle11g.
Example
Edges
A-B, B-D, C-A, C-E, C-F, H-F, E-B, G-D, G-I
Graphical
A
/ \
C - E - B - D
\ /
H - F G - I
DDL and DML
CREATE TABLE EDGE (
FROM_ID VARCHAR(10),
TO_ID VARCHAR(10)
);
INSERT INTO EDGE VALUES ('A', 'B');
INSERT INTO EDGE VALUES ('E', 'B');
INSERT INTO EDGE VALUES ('C', 'E');
INSERT INTO EDGE VALUES ('C', 'A');
INSERT INTO EDGE VALUES ('C', 'F');
INSERT INTO EDGE VALUES ('B', 'D');
INSERT INTO EDGE VALUES ('G', 'D');
INSERT INTO EDGE VALUES ('H', 'F');
INSERT INTO EDGE VALUES ('G', 'I');
Input
nodes: 'A'
Required Output
C A
C E
C F
H F
A B
E B
B D
G D
G I
Current Solution
Our current solution returns exactly what we need, but as mentioned above each additional loop increases the execution time exponentially.
SELECT
c.LVL,
c.FROM_ID,
c.TO_ID,
CASE
WHEN lag(C.TO_ID)
OVER (
PARTITION BY C.LVL
ORDER BY C.LVL, C.TO_ID ) = C.TO_ID
THEN C.LVL || '-' || C.TO_ID
WHEN lead(C.TO_ID)
OVER (
PARTITION BY C.LVL
ORDER BY C.LVL, C.TO_ID ) = C.TO_ID
THEN C.LVL || '-' || C.TO_ID
ELSE C.LVL || '-' || C.FROM_ID
END GROUP_ID
FROM (
WITH chain(LVL, FROM_ID, TO_ID ) AS (
SELECT
1 LVL,
root.FROM_ID FROM_ID,
root.TO_ID TO_ID
FROM EDGE root
WHERE root.TO_ID IN (:nodes)
OR (root.FROM_ID IN (:nodes) AND NOT EXISTS(
SELECT *
FROM EDGE
WHERE TO_ID IN (:nodes)
))
UNION ALL
SELECT
LVL +
CASE
WHEN previous.TO_ID = the_next.FROM_ID
THEN 1
WHEN previous.TO_ID = the_next.TO_ID
THEN 0
WHEN previous.FROM_ID = the_next.FROM_ID
THEN 0
ELSE -1
END LVL,
the_next.FROM_ID FROM_ID,
the_next.TO_ID TO_ID
FROM EDGE the_next
JOIN chain previous ON previous.TO_ID = the_next.FROM_ID
OR the_next.TO_ID = previous.FROM_ID
OR (previous.TO_ID = the_next.TO_ID AND previous.FROM_ID <> the_next.FROM_ID)
OR (previous.TO_ID <> the_next.TO_ID AND previous.FROM_ID = the_next.FROM_ID)
)
SEARCH BREADTH FIRST BY FROM_ID SET ORDER_ID
CYCLE FROM_ID, TO_ID SET CYCLE TO 1 DEFAULT 0
SELECT
C.*,
row_number()
OVER (
PARTITION BY LVL, FROM_ID, TO_ID
ORDER BY ORDER_ID ) rank
FROM chain C
ORDER BY LVL, FROM_ID, TO_ID
) C
WHERE C.rank = 1;
For keeping the traversing algorithm out of returning to already visited edges, one can indeed keep the visited edges somewhere. As you already found out, you won't get much of a success with a string concatenation. However, there are other usable "value concatenation" techniques available...
You have to have one handy schema-level collection of scalars at your disposal:
create or replace type arr_strings is table of varchar2(64);
And then you can collect the visited edges to that collection in each iteration:
with nondirected$ as (
select from_id, to_id, from_id||'-'||to_id as edge_desc
from edge
where from_id != to_id
union all
select to_id, from_id, from_id||'-'||to_id as edge_desc
from edge
where (to_id, from_id) not in (
select from_id, to_id
from edge
)
),
graph$(lvl, from_id, to_id, edge_desc, visited_edges) as (
select 1, from_id, to_id, edge_desc,
arr_strings(edge_desc)
from nondirected$ R
where from_id in (&nodes)
--
union all
--
select
lvl+1,
Y.from_id, Y.to_id, Y.edge_desc,
X.visited_edges multiset union arr_strings(Y.edge_desc)
from graph$ X
join nondirected$ Y
on Y.from_id = X.to_id
where not exists (
select 1
from table(X.visited_edges) Z
where Y.edge_desc = Z.column_value
)
)
search breadth first by edge_desc set order_id
cycle edge_desc set is_cycle to 1 default 0,
ranked_graph$ as (
select C.*,
row_number() over (partition by edge_desc order by lvl, order_id) as rank$
from graph$ C
-- where is_cycle = 0
)
select *
from ranked_graph$
--where rank$ <= 1
order by lvl, order_id
;
Notes
union
-ing a set of reverse edges to the input. That should make the recursive traversal predicates easier to read. Solely for my purposes of easier reading+writing of the SQL. You don't have to do that, of course.Limiting the revisited edges to zero
In SQL, you can't. The PostgreSQL solution you mentioned does do it. In Oracle, however, you can't. You would have to, for each traversal join, test rows for all other traversal joins. And that would mean some kind of aggregation or analytics... which Oracle forbids and throws out an ORA exception.
PLSQL to the rescue?
You can do it in PL/SQL, though. How much performant it is supposed to be, depends on how much memory you want to spend on prefetching edges from DB vs. how many SQL roundtrips you are willing to take to traverse the graph from "current" nodes or if you are willing to use even more memory to keep the visited nodes in a fancy indexed-by-edge collection vs. if you rather anti-join against a regular arr_output
collection l_visited_nodes
. You have multiple choices, choose wisely.
Anyway, for the simplest scenario with heavier use of SQL engine, this might be the code you're looking for...
create or replace
package pkg_so_recursive_traversal
is
type rec_output is record (
from_id edge.from_id%type,
to_id edge.to_id%type,
lvl integer
);
type arr_output is table of rec_output;
function traverse_a_graph
( i_from in arr_strings
, i_is_directed in varchar2 default 'NO' )
return arr_output
pipelined;
end pkg_so_recursive_traversal;
/
create or replace
package body pkg_so_recursive_traversal
is
function traverse_a_graph
( i_from in arr_strings
, i_is_directed in varchar2 )
return arr_output
pipelined
is
l_next_edges arr_output;
l_current_edges arr_output;
l_visited_edges arr_output := arr_output();
l_out rec_output;
i pls_integer;
l_is_directed varchar2(32) := case when i_is_directed = 'YES' then 'YES' else 'NO' end;
begin
select E.from_id, E.to_id, 0
bulk collect into l_next_edges
from table(i_from) F
join edge E
on F.column_value in (E.from_id, case when l_is_directed = 'YES' then null else E.to_id end)
where E.from_id != E.to_id;
l_out.lvl := 0;
loop
dbms_output.put_line(l_next_edges.count());
exit when l_next_edges.count() <= 0;
l_out.lvl := l_out.lvl + 1;
-- spool the edges to output
i := l_next_edges.first();
while i is not null loop
l_out.from_id := l_next_edges(i).from_id;
l_out.to_id := l_next_edges(i).to_id;
pipe row(l_out);
i := l_next_edges.next(i);
end loop;
l_current_edges := l_next_edges;
l_visited_edges := l_visited_edges multiset union l_current_edges;
-- find next edges
select unique E.from_id, E.to_id, 0
bulk collect into l_next_edges
from table(l_current_edges) CE
join edge E
on CE.to_id in (E.from_id, case when l_is_directed = 'YES' then null else E.to_id end)
or l_is_directed = 'NO' and CE.from_id in (E.from_id, E.to_id)
where E.from_id != E.to_id
and not exists (
select 1
from table(l_visited_edges) VE
where VE.from_id = E.from_id
and VE.to_id = E.to_id
);
end loop;
return;
end;
end pkg_so_recursive_traversal;
/
When called for the starting node of A
and considering the graph to be undirected...
select *
from table(pkg_so_recursive_traversal.traverse_a_graph(
i_from => arr_strings('A'),
i_is_directed => 'NO'
));
... it yields...
FROM_ID TO_ID LVL
---------- ---------- ----------
A B 1
C A 1
C E 2
B D 2
C F 2
E B 2
G D 3
H F 3
G I 4
Notes
edge
table. That may or may not be a bigger performance hit when compared to the pure-SQL solution with redundant edge visiting. Test more solutions properly, see which one works for you the best.rec_output
and arr_output
types on the schema level.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With