Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Detecting cycles in a recursive query

I have a directed graph in my PostgreSQL database, there can be multiple paths between nodes and cycles:

create table "edges" ("from" int, "to" int);
insert into "edges" values (0, 1), (1, 2), (2, 3), (3, 4), (1, 3);
insert into "edges" values (10, 11), (11, 12), (12, 11);

I'd like to find the minimum number of edges between a node and every node connected to it:

with recursive "nodes" ("node", "depth") as (
  select 0, 0
union
  select "to", "depth" + 1
  from "edges", "nodes"
  where "from" = "node"
) select * from "nodes";

Returns the depths for all paths:

 node  0 1 2 3 3 4 4 
 depth 0 1 2 2 3 3 4

 0 -> 1 -> 2 -> 3 -> 4
 0 -> 1 ------> 3 -> 4

I need the minimum, but aggregate functions are not allowed in a recursive query's recursive term

with recursive "nodes" ("node", "depth") as (
  select 0, 0
union
  select "to", min("depth") + 1
  from "edges", "nodes"
  where "from" = "node"
  group by "to"
) select * from "nodes";

Using an aggregate function on the result works though:

with recursive "nodes" ("node", "depth") as (
  select 0, 0
union all
  select "to", "depth" + 1
  from "edges", "nodes"
  where "from" = "node"
) select * from (select "node", min("depth")
                 from "nodes" group by "node") as n;

Returns as expected

node  0 1 2 3 4
depth 0 1 2 2 3

However, running into the cycle causes an infinite loop, and recursive reference to query "nodes" must not appear within a subquery, so I can't check if a node has already been visited:

with recursive "nodes" ("node", "depth") as (
  select 10, 0
union
  select "to", "depth" + 1
  from "edges", "nodes"
  where "from" = "node"
  and "to" not in (select "node" from "nodes")
) select * from "nodes";

The result I'm looking for here is

node  10 11 12
depth  0  1  2

Is there a way to do this with recursive queries / common table expressions?

The alternative would be to create a temporary table, iteratively adding rows until exhausted; i.e. a breath first search.

related: this answer checks if the node is contained in the path already and avoids loops, but still can't avoid doing the unnecessary work of checking out paths that are longer than the shortest known one because it's still behaving like a depth first search

like image 927
pascal Avatar asked Oct 21 '22 03:10

pascal


1 Answers

You can add cycle detection in your query based on documentation

with recursive "nodes" ("node", "depth", "path", "cycle") as (
  select 10, 0, ARRAY[10], false
union all
  select "to", "depth" + 1, path || "to", "to" = ANY(path)
  from "edges", "nodes"
  where "from" = "node" and not "cycle"
) select * from (select "node", min("depth"), "path", "cycle"
                from "nodes" group by "node", "path", "cycle") as n 
                where not "cycle";

This query will return data you expected

like image 158
gkocjan Avatar answered Oct 23 '22 04:10

gkocjan