Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Oracle connect by with nocycle follows root cycle

Does anyone know why Oracle continues to follow a path beyond a cyclical loop when the cycle occurs at the top node (root node connected right back to root node)? More importantly, how to prevent it?

I have Oracle 11g Release 2 (11.2) and I have been exploring hierarchical queries. I will build my question around the tree structure in figure 9-1 of the Oracle Database SQL Language Reference page 9-4

I created a table structe for this tree using the concept of vendors and cusomers:

    create table t
     ( vendor       varchar2(3)
    , customer   varchar2(3)
    );
    insert into t values ( '1'  , '2'  ); 
    insert into t values ( '2'  , '3'  ); 
    insert into t values ( '2'  , '4'  ); 
    insert into t values ( '4'  , '5'  ); 
    insert into t values ( '4'  , '6'  ); 
    insert into t values ( '1'  , '7'  ); 
    insert into t values ( '7'  , '8'  ); 
    insert into t values ( '1'  , '9'  ); 
    insert into t values ( '9'  , '10' ); 
    insert into t values ( '10' , '11' ); 
    insert into t values ( '9'  , '12' ); 
    commit;

The following select query traverses the tree with no problems:

    select vendor, 
           customer, 
           level, 
           connect_by_isleaf as isleaf, 
           connect_by_iscycle as iscycle, 
           connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
    from t
    connect by nocycle
          vendor=prior customer
    start with vendor='1';

Giving the results:

Vendor Cust     Level   Isleaf Iscycle   Path
1        2        1        0        0   1 ~ 2
2        3        2        1        0   1 ~ 2 ~ 3
2        4        2        0        0   1 ~ 2 ~ 4
4        5        3        1        0   1 ~ 2 ~ 4 ~ 5
4        6        3        1        0   1 ~ 2 ~ 4 ~ 6
1        7        1        0        0   1 ~ 7
7        8        2        1        0   1 ~ 7 ~ 8
1        9        1        0        0   1 ~ 9
9        10       2        0        0   1 ~ 9 ~ 10
10       11       3        1        0   1 ~ 9 ~ 10 ~ 11
9        12       2        1        0   1 ~ 9 ~ 12

I then complicated things by adding cycles to the structure. First a record for a vendor who sells to themselves…

    --self cycle
    insert into t values ( '4'  , '4'  ); 

and one for a vendor whos customer is the vendor of their vendor…

    --ancestor cycle
    insert into t values ( '6'  , '2'  ); 

Reexecuting the select query above results in the same output as above except Iscycle is 1 for row 3 and row 5 (Paths 1 ~ 2 ~ 4 and 1 ~ 2 ~ 4 ~ 6). Note that the CONNECT BY nomenclature flags the parent record of a cycle not the child record actually completing the cycle. (So I know 4 and 6 both cycle back to an ancestor but I don’t know WHICH ancestor.)

Adding two more records creates a larger cycle across the branches of the original tree:

 --cycle crossing branches of tree
  insert into t values ( '6'  , '9'  ); 
  insert into t values ( '11' , '2'  );  

Reexecuting the select query again gives the following output:

Vendor Customer Level   Isleaf   Iscycle       Path
1        2        1        0        0    1 ~ 2
2        3        2        1        0    1 ~ 2 ~ 3
2        4        2        0        1    1 ~ 2 ~ 4
4        5        3        1        0    1 ~ 2 ~ 4 ~ 5
4        6        3        0        1    1 ~ 2 ~ 4 ~ 6
6        9        4        0        0    1 ~ 2 ~ 4 ~ 6 ~ 9
9       10        5        0        0    1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11        6        1        1    1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12        5        1        0    1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1        7        1        0        0    1 ~ 7
7        8        2        1        0    1 ~ 7 ~ 8
1        9        1        0        0    1 ~ 9
9       10        2        0        0    1 ~ 9 ~ 10
10      11        3        0        0    1 ~ 9 ~ 10 ~ 11
11       2        4        0        0    1 ~ 9 ~ 10 ~ 11 ~ 2
2        3        5        1        0    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2        4        5        0        1    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4        5        6        1        0    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4        6        6        1        1    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12        2        1        0    1 ~ 9 ~ 12

The output continues to be as expected. All cycles are flaged and the mapping stops when a cycle is encountered.

Now the problem child… Let’s add a self cycle to the root node which is exactly the same as the first cycle created above with node 4; just for node 1.

    insert into t values ( '1'  , '1'  );

This time Oracle detects the cycle at node 1, as expected (first row is flagged with Iscycle set to 1); HOWEVER, it continues past this cycle and builds out the entire tree structure twice. Rows 2 through 21 are a duplication of rows 22 through 41 with the cycle of node 1 prepended onto the front of the path.

Vendor Customer  Level Isleaf Iscycle    Path
1        1        1        0    1      1 ~ 1
1        2        2        0    0      1 ~ 1 ~ 2
2        3        3        1    0      1 ~ 1 ~ 2 ~ 3
2        4        3        0    1      1 ~ 1 ~ 2 ~ 4
4        5        4        1    0      1 ~ 1 ~ 2 ~ 4 ~ 5
4        6        4        0    1      1 ~ 1 ~ 2 ~ 4 ~ 6
6        9        5        0    0      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9
9       10        6        0    0      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11        7        1    1      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12        6        1    0      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1        7        2        0    0      1 ~ 1 ~ 7
7        8        3        1    0      1 ~ 1 ~ 7 ~ 8
1        9        2        0    0      1 ~ 1 ~ 9
9       10        3        0    0      1 ~ 1 ~ 9 ~ 10
10      11        4        0    0      1 ~ 1 ~ 9 ~ 10 ~ 11
11       2        5        0    0      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2
2        3        6        1    0      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2        4        6        0    1      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4        5        7        1    0      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4        6        7        1    1      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12        3        1    0      1 ~ 1 ~ 9 ~ 12
1        2        1        0    0      1 ~ 2
2        3        2        1    0      1 ~ 2 ~ 3
2        4        2        0    1      1 ~ 2 ~ 4
4        5        3        1    0      1 ~ 2 ~ 4 ~ 5
4        6        3        0    1      1 ~ 2 ~ 4 ~ 6
6        9        4        0    0      1 ~ 2 ~ 4 ~ 6 ~ 9
9       10        5        0    0      1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11        6        1    1      1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12        5        1    0      1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1        7        1        0    0      1 ~ 7
7        8        2        1    0      1 ~ 7 ~ 8
1        9        1        0    0      1 ~ 9
9       10        2        0    0      1 ~ 9 ~ 10
10      11        3        0    0      1 ~ 9 ~ 10 ~ 11
11       2        4        0    0      1 ~ 9 ~ 10 ~ 11 ~ 2
2        3        5        1    0      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2        4        5        0    1      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4        5        6        1    0      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4        6        6        1    1      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12        2        1    0      1 ~ 9 ~ 12

Why isn’t the 1-1 cycle treated the same as the 4-4 cycle? What am I missing?

To mitigate against this I added an additional condition on the CONNECT BY clause requiring that the customer not be ‘1’.

    select vendor, 
           customer, 
           level, 
           connect_by_isleaf as isleaf, 
           connect_by_iscycle as iscycle, 
           connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
    from t
    connect by nocycle
          vendor=prior customer
          and customer<>'1' 
    start with vendor='1';

Ironically, all this did was REMOVE the cycle flag from row one.

Any help would be appreciated.

like image 438
Larry May Avatar asked Sep 20 '13 21:09

Larry May


People also ask

What is the use of Connect by in Oracle?

CONNECT BY specifies the relationship between parent rows and child rows of the hierarchy. The NOCYCLE parameter instructs Oracle Database to return rows from a query even if a CONNECT BY LOOP exists in the data. Use this parameter along with the CONNECT_BY_ISCYCLE pseudocolumn to see which rows contain the loop.

What is the use of Connect by Prior in Oracle?

CONNECT BY: It specifies the relationship between parent rows and child rows of the hierarchy. PRIOR: It's a unary operator and it is used to achieve the recursive condition i.e the actual walking.

How do I resolve ORA 01436 Connect by loop in user data?

Question: How do I fix a ORA-01436 error? Cause: The condition specified in a CONNECT BY clause caused a loop in the query, where the next record to be selected is a descendent of itself. When this happens, there can be no end to the query. Action: Check the CONNECT BY clause and remove the circular reference.

What is connect by level in Oracle?

The CONNECT BY clause defines the hierarchical relationship between the parent rows and the child rows of the hierarchy. DUAL is a dummy table automatically generated by Oracle database along with data dictionary. Example-1: SELECT Level AS Sequence.

Is Leaf connect by?

CONNECT_BY_ISLEAF is a pseudocolumn that returns a 1 if the row is a leaf in the hierarchy as defined by the CONNECT BY clause. A node is a leaf node if it has no children in the query result hierarchy (not in the actual data hierarchy). If the row is not a leaf the column returns 0 .


2 Answers

Oracle selects the root row(s) of the hierarchy (those rows that satisfy the START WITH condition.) Oracle selects the child rows of each root row. Each child row must satisfy the condition of the CONNECT BY condition with respect to one of the root rows.

To find the children of a parent row, Oracle evaluates the PRIOR expression of the CONNECT BY condition for the parent row and the other expression for each row in the table. Rows for which the condition is true are the children of the parent. The CONNECT BY condition can contain other conditions to further filter the rows selected by the query.

A root row is the highest row within an inverted tree. 

If you try with same parent as child (22 or 33 or 44) it will work since they are not root rows and just parents Since 1 is the root and also a child with 1, the LEVEL is set to be cycle due to CONNECT_BY_ROOT clause

The duplication in output occurs since the connect by works on root which is duplicated as well.

Oracle is not able to restrict the uniqueness since Oracle can't give preference to one of the other

Either make your data set unique or code them such that oracle can work on preference in hierarchy

FOLLOW UP: SOLUTION FOR OP's problem

SELECT
      VENDOR,
      CUSTOMER,
      LEVEL,
      CONNECT_BY_ISLEAF AS ISLEAF,
      CONNECT_BY_ISCYCLE AS ISCYCLE,
      CONNECT_BY_ROOT VENDOR
      || SYS_CONNECT_BY_PATH ( CUSTOMER,
                          ' ~ ' )
          AS PATH
FROM
      (SELECT
            VENDOR,
            CUSTOMER
       FROM
            T
       WHERE
            CUSTOMER <> '1')
CONNECT BY
      NOCYCLE VENDOR = PRIOR CUSTOMER
START WITH
      VENDOR = '1';

Results:

VENDOR CUSTOMER      LEVEL     ISLEAF    ISCYCLE PATH                                                                            
------ -------- ---------- ---------- ------------------------------------------------------------------------------------------
1      2                 1          0          0 1 ~ 2                                                                           
2      3                 2          1          0 1 ~ 2 ~ 3                                                                       
2      4                 2          0          1 1 ~ 2 ~ 4                                                                       
4      5                 3          1          0 1 ~ 2 ~ 4 ~ 5                                                                   
4      6                 3          0          1 1 ~ 2 ~ 4 ~ 6                                                                   
6      9                 4          0          0 1 ~ 2 ~ 4 ~ 6 ~ 9                                                               
9      10                5          0          0 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10                                                          
10     11                6          1          1 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11                                                     
9      12                5          1          0 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12                                                          
1      7                 1          0          0 1 ~ 7                                                                           
7      8                 2          1          0 1 ~ 7 ~ 8                                                                       
1      9                 1          0          0 1 ~ 9                                                                           
9      10                2          0          0 1 ~ 9 ~ 10                                                                      
10     11                3          0          0 1 ~ 9 ~ 10 ~ 11                                                                 
11     2                 4          0          0 1 ~ 9 ~ 10 ~ 11 ~ 2                                                             
2      3                 5          1          0 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3                                                         
2      4                 5          0          1 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4                                                         
4      5                 6          1          0 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5                                                     
4      6                 6          1          1 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6                                                     
9      12                2          1          0 1 ~ 9 ~ 12                                                                      

20 rows selected
like image 171
SriniV Avatar answered Oct 19 '22 09:10

SriniV


I would agree with @realspirituals's initial part of explanation about how Oracle deals with hierarchical data. In my vision the first step taken is to find root elements of the trees specified by START WITH clause. This might be rephrased to following query:

select * from t where vendor = '1';
VENDOR  CUSTOMER
------------------
1   2
1   7
1   9
1   1

So actually we have 4 root nodes and 4 separate trees. Next steps are to iteratively evaluate CONNECT BY clause. Imagine that we take above list of CUSTOMER values and seeking for their descendants:

select * from t where vendor in ('2', '7', '9', '1');
VENDOR  CUSTOMER
------------------
1   2
2   3
2   4
1   7
7   8
1   9
9   10
9   12
1   1 --This one is loop and is not taken to final resultset

As soon as we specified NOCYCLE, detected loops are are thrown away and previous row that led us to the loop record is marked as CONNECT_BY_ISCYCLE = 1.

The third step:

select * from t where vendor in ('2', '3', '4', '7', '8', '9', '10', '12');
VENDOR  CUSTOMER
------------------
2   3
2   4
4   5
4   6
7   8
9   10
10  11
9   12
4   4 --This one is loop

So it goes until there is at least one record in output. It takes some time and patience but results returned by your query are fully reproduceable and seems absolutely legitimate to me. That is the way Oracle's algorythm works so everyone just have to keep it in mind when writing queries.

How can we avoid cycle on the top level node? I would suggest to add virtual record that will make our top level node not the top one. Consider this:

insert into t values(null, '1');

select vendor, 
       customer, 
       level, 
       connect_by_isleaf as isleaf, 
       connect_by_iscycle as iscycle, 
       connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
from t
connect by nocycle
      vendor=prior customer
start with vendor is null; --Note the changed condition

Vendor Customer Level   Isleaf  Iscycle  Path
------------------------------------------------------------
        1       1       0       1        ~ 1
1       2       2       0       0        ~ 1 ~ 2
2       3       3       1       0        ~ 1 ~ 2 ~ 3
2       4       3       0       1        ~ 1 ~ 2 ~ 4
4       5       4       1       0        ~ 1 ~ 2 ~ 4 ~ 5
4       6       4       0       1        ~ 1 ~ 2 ~ 4 ~ 6
6       9       5       0       0        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9
9       10      6       0       0        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11      7       1       1        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12      6       1       0        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1       7       2       0       0        ~ 1 ~ 7
7       8       3       1       0        ~ 1 ~ 7 ~ 8
1       9       2       0       0        ~ 1 ~ 9
9       10      3       0       0        ~ 1 ~ 9 ~ 10
10      11      4       0       0        ~ 1 ~ 9 ~ 10 ~ 11
11      2       5       0       0        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2
2       3       6       1       0        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2       4       6       0       1        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4       5       7       1       0        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4       6       7       1       1        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12      3       1       0        ~ 1 ~ 9 ~ 12

Of course it mignt be not appropriate to add new records to production database. Rather combine query to real table with some query that dynamically determines top level nodes. Something like that (giving the same output as above):

delete from t where vendor is null; --Removing previosly inserted record

select vendor, 
       customer, 
       level, 
       connect_by_isleaf as isleaf, 
       connect_by_iscycle as iscycle, 
       connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
from (select vendor, customer from t
      union all
      select distinct null, vendor from t
      where vendor = 1) --Here is your START WITH condition
connect by nocycle
      vendor=prior customer
start with vendor is null;
like image 24
Yaroslav Shabalin Avatar answered Oct 19 '22 09:10

Yaroslav Shabalin