I've created a simple example to illustrate transitive closure using recursive queries in PostgreSQL.
However, something is off with my recursive query. I'm not familiar with the syntax yet so this request may be entirely noobish of me, and for that I apologize in advance. If you run the query, you will see that node 1 repeats itself in the path results. Can someone please help me figure out how to tweak the SQL?
/* 1
/ \
2 3
/ \ /
4 5 6
/
7
/ \
8 9
*/
create table account(
acct_id INT,
parent_id INT REFERENCES account(acct_id),
acct_name VARCHAR(100),
PRIMARY KEY(acct_id)
);
insert into account (acct_id, parent_id, acct_name) values (1,1,'account 1');
insert into account (acct_id, parent_id, acct_name) values (2,1,'account 2');
insert into account (acct_id, parent_id, acct_name) values (3,1,'account 3');
insert into account (acct_id, parent_id, acct_name) values (4,2,'account 4');
insert into account (acct_id, parent_id, acct_name) values (5,2,'account 5');
insert into account (acct_id, parent_id, acct_name) values (6,3,'account 6');
insert into account (acct_id, parent_id, acct_name) values (7,4,'account 7');
insert into account (acct_id, parent_id, acct_name) values (8,7,'account 8');
insert into account (acct_id, parent_id, acct_name) values (9,7,'account 9');
WITH RECURSIVE search_graph(acct_id, parent_id, depth, path, cycle) AS (
SELECT g.acct_id, g.parent_id, 1,
ARRAY[g.acct_id],
false
FROM account g
UNION ALL
SELECT g.acct_id, g.parent_id, sg.depth + 1,
path || g.acct_id,
g.acct_id = ANY(path)
FROM account g, search_graph sg
WHERE g.acct_id = sg.parent_id AND NOT cycle
)
SELECT path[1] as Child,parent_id as Parent,path || parent_id as path FROM search_graph
ORDER BY path[1],depth;
A recursive query is one that is defined by a Union All with an initialization fullselect that seeds the recursion. The iterative fullselect contains a direct reference to itself in the FROM clause.
Recursive CTEs enable you to process hierarchical data, such as a parts explosion (component, sub-components) or a management hierarchy (manager, employees). For more information about hierarchical data, and other ways to query hierarchical data, see Querying Hierarchical Data.
A recursive CTE is a subquery which refer to itself using its own name. The recursive CTEs are defined using WITH RECURSIVE clause. There should be a terminating condition to recursive CTE. The recursive CTEs are used for series generation and traversal of hierarchical or tree-structured data.
The computation of transitive closure is the determination of the existence of a path between nodes in a graph. Our choice is motivated from two perspectives. Firstly, transitive closure is the canonical representative of recursive queries. It is known to be inexpressible in relational calculus and SQL 2, 23].
You can simplify in several places (assuming acct_id
and parent_id
are NOT NULL
):
WITH RECURSIVE search_graph AS (
SELECT parent_id, ARRAY[acct_id] AS path
FROM account
UNION ALL
SELECT g.parent_id, sg.path || g.acct_id
FROM search_graph sg
JOIN account g ON g.acct_id = sg.parent_id
WHERE g.acct_id <> ALL(sg.path)
)
SELECT path[1] AS child
, path[array_upper(path,1)] AS parent
, path
FROM search_graph
ORDER BY path;
acct_id
, depth
, cycle
are just noise in your query.WHERE
condition has to exit the recursion one step earlier, before the duplicate entry from the top node is in the result. That was an "off-by-one" in your original.The rest is formatting.
If you know the only possible circle in your graph is a self-reference, we can have that cheaper:
WITH RECURSIVE search_graph AS (
SELECT parent_id, ARRAY[acct_id] AS path, acct_id <> parent_id AS keep_going
FROM account
UNION ALL
SELECT g.parent_id, sg.path || g.acct_id, g.acct_id <> g.parent_id
FROM search_graph sg
JOIN account g ON g.acct_id = sg.parent_id
WHERE sg.keep_going
)
SELECT path[1] AS child
, path[array_upper(path,1)] AS parent
, path
FROM search_graph
ORDER BY path;
SQL Fiddle.
Note there would be problems (at least up to pg v9.4) for data types with a modifier (like varchar(5)
) because array concatenation loses the modifier but the rCTE insists on types matching exactly:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With