Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to properly label branches of a tree in a depth first search

I have a tree with a structure like this:

     __2__3__4
    /   \__5__6
0__1___7/__8__9
   \\
    \\__10__11__12
     \__  __  __
        13  14  15

Node 1 has four children (2,7,10,13), nodes 2 and 7 have two children each (both sharing node 5 as a child). What I am trying to do is create a CTE that provide records containing the parent node, the node, the distance away from the root, and the branch (or fork) its contained in.

IF (OBJECT_ID('tempdb..#Discovered') IS NOT NULL)
BEGIN
    DROP TABLE #Discovered
END

CREATE TABLE #Discovered
(
    ID int PRIMARY KEY NOT NULL,
    Predecessor int NULL,
    OrderDiscovered int
);

INSERT INTO #Discovered (ID, Predecessor, OrderDiscovered)
VALUES (@nodeId, NULL, 0);

    --loop through node connections table in a breadth first manner
WHILE @@ROWCOUNT > 0
BEGIN
    INSERT INTO #Discovered (ID, Predecessor, OrderDiscovered)
    SELECT c.node2_id
               ,MIN(c.node1_id)
               ,MIN(d.OrderDiscovered) + 1

    FROM #Discovered d JOIN node_connections c ON d.ID = c.node1_id
    WHERE c.node2_id NOT IN (SELECT ID FROM #Discovered)
    GROUP BY c.node2_id
END;

SELECT * FROM #Discovered;

WITH BacktraceCTE(Id, Predecessor, OrderDiscovered, Path, fork)

 AS 

 (  

     SELECT d.ID, d.Predecessor, d.OrderDiscovered, CAST(d.ID AS varchar(MAX)), 0

     FROM #Discovered d

     WHERE d.Id = @itemId


     UNION ALL             

     -- Recursive member, select all the nodes which have the previous

     SELECT d.ID, d.Predecessor, d.OrderDiscovered,  

         CAST(cte.Path + '->' + CAST(d.ID as varchar(10)) as varchar(MAX)),
         fork + CONVERT ( Integer, ROW_NUMBER() OVER (ORDER BY d.ID)) - 1

     FROM #Discovered d JOIN BacktraceCTE cte ON d.Predecessor = cte.ID

 )          

 SELECT  Predecessor as node1_id, OrderDiscovered as hop, fork, ID as node2_id, Path FROM BacktraceCTE  
 ORDER BY fork, OrderDiscovered;

The problem is with how the fork is calculated. Every time the CTE returns back to a previous level it only has the row numbers available and what the fork number was at that level. What I'd like to achieve is records where each combination of hop and fork are unique.

However, with the above code I'll get results that say node 2 to 5 end up being hop 3 fork 1 AND node 7 to 5 also end up being hop 3 fork 1.

Here is the tree again with the labeling of the branches with how they should appear:

     __2__3__4      :0
    /   \__5__6     :1,2
0__1___7/__8__9     :3
   \\
    \\__10__11__12  :4
     \__  __  __
        13  14  15  :5

As you can see for forks 1 and 2 I think that the best method would be to count the branch twice giving it its own identifier thus preserving the uniqueness of the combination of hop and fork.

Please help me figure out what I need to do in order to achieve this. I feel this should be possible with a CTE but perhaps I am wrong and if I am I'd love to know what the better method to tackle this would be.

EDIT The result set would look like:

Predecessor, ID, Order Discovered, Path, Fork

  • null, 0, 0, 0, 0

  • 0, 1, 1, 0->1, 0

  • 1, 2, 2, 0->1->2, 0

  • 2, 3, 3, 0->1->2->3, 0

  • 3, 4, 4, 0->1->2->3->4, 0

  • 2, 5, 3, 0->1->2->5, 1

  • 5, 6, 4, 0->1->2->5->6, 1

  • 1, 7, 2, 0->1->7, 2

  • 7, 5, 3, 0->1->7->5, 2

  • 5, 6, 4, 0->1->7->5->6, 2

  • 7, 8, 3, 0->1->7->8, 3

  • 8, 9, 4, 0->1->7->8->9, 3

  • 1, 10, 2, 0->1->10, 4

  • 10, 11, 3, 0->1->10->11, 4

  • 11, 12, 4, 0->1->10->11->12, 4

  • 1, 13, 2, 0->1->13, 5

  • 13, 14, 3, 0->1->13->14, 5

  • 14, 15, 4, 0->1->13->14->15, 5

like image 629
Kavet Kerek Avatar asked Jan 18 '12 15:01

Kavet Kerek


1 Answers

Okay, I'll try to refrain from tweaking this answer again. It's just been so fun learning about the sort order of VarBinary, finding the POWER function, CTEs beating on one another, ... .

Are you looking for anything like:

declare @Nodes as Table ( NodeId Int Identity(0,1), Name VarChar(10) )
declare @Relations as Table ( ParentNodeId Int, ChildNodeId Int, SiblingOrder Int )
insert into @Nodes ( Name ) values
--  ( '0' ), ( '1' ), ( '2' ), ( '3' ), ( '4' ), ( '5' ), ( '6' ), ( '7' ), ( '8' ),
--  ( '9' ), ( '10' ), ( '11' ), ( '12' ), ( '13' ), ( '14' ), ( '15' )
  ( 'zero' ), ( 'one' ), ( 'two' ), ( 'three' ), ( 'four' ), ( 'five' ), ( 'six' ), ( 'seven' ), ( 'eight' ),
  ( 'nine' ), ( 'ten' ), ( 'eleven' ), ( 'twelve' ), ( 'thirteen' ), ( 'fourteen' ), ( 'fifteen' )

insert into @Relations ( ParentNodeId, ChildNodeId, SiblingOrder ) values
  ( 0, 1, 0 ),
  ( 1, 2, 0 ), ( 1, 7, 1 ), ( 1, 10, 2 ), ( 1, 13, 3 ),
  ( 2, 3, 0 ), ( 2, 5, 1 ),
  ( 3, 4, 0 ),
  ( 5, 6, 0 ),
  ( 7, 5, 0 ), ( 7, 8, 1 ),
  ( 8, 9, 0 ),
  ( 10, 11, 0 ),
  ( 11, 12, 0 ),
  ( 13, 14, 0 ),
  ( 14, 15, 0 )

declare @MaxSiblings as BigInt = 100
; with
DiGraph ( NodeId, Name, Depth, ParentNodeId, Path, ForkIndex, DepthFirstOrder )
as (
  -- Start with the root node(s).
  select NodeId, Name, 0, Cast( NULL as Int ), Cast( Name as VarChar(1024) ),
    Cast( 0 as BigInt ), Cast( Right( '00' + Cast( 0 as VarChar(2) ), 2 ) as VarChar(128) )
    from @Nodes
    where not exists ( select 42 from @Relations where ChildNodeId = NodeId )
  union all
  -- Add children one generation at a time.
  select R.ChildNodeId, N.Name, DG.Depth + 1, R.ParentNodeId, Cast( DG.Path + ' > ' + N.Name as VarChar(1024) ),
    DG.ForkIndex + R.SiblingOrder * Power( @MaxSiblings, DG.Depth - 1 ),
    Cast( DG.DepthFirstOrder + Right( '00' + Cast( R.SiblingOrder as VarChar(2) ), 2 ) as VarChar(128) )
    from @Relations as R inner join
      DiGraph as DG on DG.NodeId = R.ParentNodeId inner join
      @Nodes as N on N.NodeId = R.ChildNodeId inner join
      @Nodes as Parent on Parent.NodeId = R.ParentNodeId
  ),

DiGraphSorted ( NodeId, Name, Depth, ParentNodeId, Path, ForkIndex, DepthFirstOrder, RowNumber )
as ( select *, Row_Number() over ( order by DepthFirstOrder ) as 'RowNumber' from DiGraph )

select ParentNodeId, NodeId, Depth, Path,
  ( select Count(*) from DiGraphSorted as L
    left outer join DiGraphSorted as R on R.RowNumber = L.RowNumber - 1 where
    R.RowNumber < DG.RowNumber and L.ForkIndex <> R.ForkIndex ) as 'ForkNumber' -- , '', *
  from DiGraphSorted as DG
  order by RowNumber
like image 101
HABO Avatar answered Nov 02 '22 15:11

HABO