How to select using WITH RECURSIVE clause [closed]

Tags:

I have googled and read throug some articles like this postgreSQL manual page or this blog page and tried making queries myself with a moderate success (part of them hangs, while others works good and fast), but so far I can not completely understand how this magic works.

Can anybody give very clear explanation demonstrating such query semantics and execution process, better based on typical samples like factorial calculation or full tree expansion from (id,parent_id,name) table?

And what are the basic guidlines and typical mistakes that one should know to make good with recursive queries?

919

asked Sep 06 '13 14:09

user2754703

1 Answers

First of all, let us try to simplify and clarify algorithm description given on the manual page. To simplify it consider only union all in with recursive clause for now (and union later):

WITH RECURSIVE pseudo-entity-name(column-names) AS (     Initial-SELECT UNION ALL     Recursive-SELECT using pseudo-entity-name ) Outer-SELECT using pseudo-entity-name

To clarify it let us describe query execution process in pseudo code:

working-recordset = result of Initial-SELECT  append working-recordset to empty outer-recordset  while( working-recordset is not empty ) begin      new working-recordset = result of Recursive-SELECT          taking previous working-recordset as pseudo-entity-name      append working-recordset to outer-recordset  end  overall-result = result of Outer-SELECT      taking outer-recordset as pseudo-entity-name

Or even shorter - Database engine executes initial select, taking its result rows as working set. Then it repeatedly executes recursive select on the working set, each time replacing contents of the working set with query result obtained. This process ends when empty set is returned by recursive select. And all result rows given firstly by initial select and then by recursive select are gathered and feeded to outer select, which result becomes overall query result.

This query is calculating factorial of 3:

WITH RECURSIVE factorial(F,n) AS (     SELECT 1 F, 3 n UNION ALL     SELECT F*n F, n-1 n from factorial where n>1 ) SELECT F from factorial where n=1

Initial select SELECT 1 F, 3 n gives us initial values: 3 for argument and 1 for function value.
Recursive select SELECT F*n F, n-1 n from factorial where n>1 states that every time we need to multiply last funcion value by last argument value and decrement argument value.
Database engine executes it like this:

First of all it executes initail select, which gives the initial state of working recordset:

F | n --+-- 1 | 3

Then it transforms working recordset with recursive query and obtain its second state:

F | n --+-- 3 | 2

Then third state:

F | n --+-- 6 | 1

In the third state there is no row which follows n>1 condition in recursive select, so forth working set is loop exits.

Outer recordset now holds all the rows, returned by initial and recursive select:

F | n --+-- 1 | 3 3 | 2 6 | 1

Outer select filters out all intermediate results from outer recordset, showing only final factorial value which becomes overall query result:

F  -- 6

And now let us consider table forest(id,parent_id,name):

id | parent_id | name ---+-----------+----------------- 1  |           | item 1 2  | 1         | subitem 1.1 3  | 1         | subitem 1.2 4  | 1         | subitem 1.3 5  | 3         | subsubitem 1.2.1 6  |           | item 2 7  | 6         | subitem 2.1 8  |           | item 3

'Expanding full tree' here means sorting tree items in human-readable depth-first order while calculating their levels and (maybe) paths. Both tasks (of correct sorting and calculating level or path) are not solvable in one (or even any constant number of) SELECT without using WITH RECURSIVE clause (or Oracle CONNECT BY clause, which is not supported by PostgreSQL). But this recursive query does the job (well, almost does, see the note below):

WITH RECURSIVE fulltree(id,parent_id,level,name,path) AS (     SELECT id, parent_id, 1 as level, name, name||'' as path from forest where parent_id is null UNION ALL     SELECT t.id, t.parent_id, ft.level+1 as level, t.name, ft.path||' / '||t.name as path     from forest t, fulltree ft where t.parent_id = ft.id ) SELECT * from fulltree order by path

Database engine executes it like this:

Firstly, it executes initail select, which gives all highest level items (roots) from forest table:

id | parent_id | level | name             | path ---+-----------+-------+------------------+---------------------------------------- 1  |           | 1     | item 1           | item 1 8  |           | 1     | item 3           | item 3 6  |           | 1     | item 2           | item 2

Then, it executes recursive select, which gives all 2nd level items from forest table:

id | parent_id | level | name             | path ---+-----------+-------+------------------+---------------------------------------- 2  | 1         | 2     | subitem 1.1      | item 1 / subitem 1.1 3  | 1         | 2     | subitem 1.2      | item 1 / subitem 1.2 4  | 1         | 2     | subitem 1.3      | item 1 / subitem 1.3 7  | 6         | 2     | subitem 2.1      | item 2 / subitem 2.1

Then, it executes recursive select again, retrieving 3d level items:

id | parent_id | level | name             | path ---+-----------+-------+------------------+---------------------------------------- 5  | 3         | 3     | subsubitem 1.2.1 | item 1 / subitem 1.2 / subsubitem 1.2.1

And now it executes recursive select again, trying to retrieve 4th level items, but there are none of them, so the loop exits.

The outer SELECT sets the correct human-readable row order, sorting on path column:

id | parent_id | level | name             | path ---+-----------+-------+------------------+---------------------------------------- 1  |           | 1     | item 1           | item 1 2  | 1         | 2     | subitem 1.1      | item 1 / subitem 1.1 3  | 1         | 2     | subitem 1.2      | item 1 / subitem 1.2 5  | 3         | 3     | subsubitem 1.2.1 | item 1 / subitem 1.2 / subsubitem 1.2.1 4  | 1         | 2     | subitem 1.3      | item 1 / subitem 1.3 6  |           | 1     | item 2           | item 2 7  | 6         | 2     | subitem 2.1      | item 2 / subitem 2.1 8  |           | 1     | item 3           | item 3

NOTE: Resulting row order will remain correct only while there are no punctuation characters collation-preceeding / in the item names. If we rename Item 2 in Item 1 *, it will break row order, standing between Item 1 and its descendants.
More stable solution is using tab character (E'\t') as path separator in query (which can be substituted by more readable path separator later: in outer select, before displaing to human or etc). Tab separated paths will retain correct order until there are tabs or control characters in the item names - which easily can be checked and ruled out without loss of usability.

It is very simple to modify last query to expand any arbitrary subtree - you need only to substitute condition parent_id is null with perent_id=1 (for example). Note that this query variant will return all levels and paths relative to Item 1.

And now about typical mistakes. The most notable typical mistake specific to recursive queries is defining ill stop conditions in recursive select, which results in infinite looping.

For example, if we omit where n>1 condition in factorial sample above, execution of recursive select will never give an empty set (because we have no condition to filter out single row) and looping will continue infinitely.

That is the most probable reason why some of your queries hang (the other non-specific but still possible reason is very ineffective select, which executes in finite but very long time).

There are not much RECURSIVE-specific querying guidlines to mention, as far as I know. But I would like to suggest (rather obvious) step by step recursive query building procedure.

Separately build and debug your initial select.
Wrap it with scaffolding WITH RECURSIVE construct
and begin building and debugging your recursive select.

The recommended scuffolding construct is like this:

WITH RECURSIVE rec( <Your column names> ) AS (     <Your ready and working initial SELECT> UNION ALL     <Recursive SELECT that you are debugging now> ) SELECT * from rec limit 1000

This simplest outer select will output the whole outer recordset, which, as we know, contains all output rows from initial select and every execution of recusrive select in a loop in their original output order - just like in samples above! The limit 1000 part will prevent hanging, replacing it with oversized output in which you will be able to see the missed stop point.

After debugging initial and recursive select build and debug your outer select.

And now the last thing to mention - the difference in using union instead of union all in with recursive clause. It introduces row uniqueness constraint which results in two extra lines in our execution pseudocode:

working-recordset = result of Initial-SELECT  discard duplicate rows from working-recordset /*union-specific*/  append working-recordset to empty outer-recordset  while( working-recordset is not empty ) begin      new working-recordset = result of Recursive-SELECT          taking previous working-recordset as pseudo-entity-name      discard duplicate rows and rows that have duplicates in outer-recordset          from working-recordset /*union-specific*/      append working-recordset to outer-recordset  end  overall-result = result of Outer-SELECT      taking outer-recordset as pseudo-entity-name

139

answered Sep 22 '22 06:09

mas.morozov

Related questions
                            
                                Pros and cons of the use of the DAO pattern [closed]
                            
                                How to join two tables mysql?
                            
                                SQL: Select dynamic column name based on variable
                            
                                SQL query in Spark/scala Size exceeds Integer.MAX_VALUE
                            
                                Select average from MySQL table with LIMIT
                            
                                Compare Columns Where One is Similar to Part of Another
                            
                                Query several NEXTVAL from sequence in one statement
                            
                                MySQL get mindate and maxdate in one query
                            
                                Using a view with no primary key with Entity
                            
                                SQLite - how to get average value?
                            
                                Can I use a MySQL database with an App Engine application
                            
                                How do I edit a table in order to enable CASCADE DELETE?
                            
                                T-SQL Insert into table without having to specify every column
                            
                                Does SQLite3 not support foreign key constraints?
                            
                                MS Sql: Conditional ORDER BY ASC/DESC Question
                            
                                how to see generated sql from a linq query
                            
                                Database structure for "customer" table having many orders per customer and many items per order
                            
                                Where did Open Table go in SQL Server 2008?
                            
                                Core Data Performance with Single Parent Entity
                            
                                'CREATE VIEW' must be the first statement in a query batch

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to select using WITH RECURSIVE clause [closed]

Tags:

sql

postgresql

recursive-query

user2754703

People also ask

1 Answers

mas.morozov

Recent Activity

Donate For Us