Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL - How to find a value in a tree level data structure

Tags:

sql

sql-server

I have two SQL Server tables:

  • Invoice (invoice)
  • Invoice Relations (invoice_relation)

invoice table stores all invoice records with a transaction folio.

enter image description here

invoice_relation table stores any relation between invoices.

enter image description here

This is an example of how invoices can be related between each other:

enter image description here

So the goal is to find the "folio" under invoice table given an invoicenumber and a folio but the folio sometimes won't be the folio that the invoice has, so I need to do a search on all the tree relation in order to find if any invoice match the invoice number but also the folio is part of the relation.

For example I have to find the folio and match invoice number of:

  • Folio: 1003
  • Invoice Number: A1122

In my query I would need to first find by folio because it's my invoice table primary key. Then, will try to match A1122 with D1122 that won't match, so then I have to search all the tree structure to find if there is a A1122. The result will be that the invoice A1122 was found in folio 1000.

Any clue on how to do this?

Here is a script of how to create the above example tables with data:

CREATE TABLE [dbo].[invoice](
    [folio] [int] NOT NULL,
    [invoicenumber] [nvarchar](20) NOT NULL,
    [isactive] [bit] NOT NULL,
 CONSTRAINT [PK_invoice] PRIMARY KEY CLUSTERED 
(
    [folio] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO


CREATE TABLE [dbo].[invoice_relation](
    [relationid] [int] NOT NULL,
    [invoice] [nvarchar](20) NOT NULL,
    [parentinvoice] [nvarchar](20) NOT NULL,
 CONSTRAINT [PK_invoice_relation_1] PRIMARY KEY CLUSTERED 
(
    [relationid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1000, N'A1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1001, N'B1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1002, N'C1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1003, N'D1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1004, N'F1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1005, N'G1122', 1)
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (1, N'A1122', N'B1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (2, N'C1122', N'A1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (3, N'D1122', N'A1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (4, N'F1122', N'B1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (5, N'G1122', N'F1122')
GO
like image 806
VAAA Avatar asked Aug 18 '16 16:08

VAAA


4 Answers

I am still not sure what you really want, I had written something similar to JamieD77 which is to find top parent and then walk back down tree but then you get children and granchildren that are not directly related to A1122.....

Here is a way to walk up and down the tree and return all children and parents directly related to an invoicenumber

DECLARE @InvoiceNumber NVARCHAR(20) = 'A1122'
DECLARE @Folio INT = 1003

;WITH cteFindParents AS (
    SELECT
       i.folio
       ,i.invoicenumber
       ,CAST(NULL AS NVARCHAR(20)) as ChildInvoiceNumber
       ,CAST(NULL AS NVARCHAR(20)) as ParentInvoiceNumber
       ,0 as Level
    FROM
       dbo.invoice i
    WHERE
       i.invoicenumber = @InvoiceNumber

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,c.invoicenumber as ChildInvoiceNumber
       ,i.invoicenumber as ParentInvoiceNumber
       ,c.Level - 1 as Level
    FROM
       cteFindParents c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.invoice
       INNER JOIN dbo.invoice i
       ON r.parentinvoice = i.invoicenumber
)

, cteFindChildren as (
    SELECT *
    FROM
       cteFindParents

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,i.invoicenumber AS ChildInvoiceNumber
       ,c.invoicenumber AS ParentInvoiceNumber
       ,Level + 1 as Level
    FROM
       cteFindChildren c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.parentinvoice
       INNER JOIN dbo.invoice i
       ON r.invoice = i.invoicenumber
    WHERE
       c.Level = 0
)

SELECT *
FROM
    cteFindChildren

But Depending on what exactly you are looking for you may actually get a couple of cousins that are not desired.....

--------------Here was a method to find top parent and get the whole tree

DECLARE @InvoiceNumber NVARCHAR(20) = 'A1122'
DECLARE @Folio INT = 1003

;WITH cteFindParents AS (
    SELECT
       i.folio
       ,i.invoicenumber
       ,CAST(NULL AS NVARCHAR(20)) as ChildInvoiceNumber
       ,0 as Level
    FROM
       dbo.invoice i
    WHERE
       i.invoicenumber = @InvoiceNumber

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,c.invoicenumber as ChildInvoiceNumber
       ,c.Level + 1 as Level
    FROM
       cteFindParents c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.invoice
       INNER JOIN dbo.invoice i
       ON r.parentinvoice = i.invoicenumber

)

, cteGetTopParent AS  (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY LEVEL DESC) as RowNum
    FROM
       cteFindParents
)

, cteGetWholeTree AS (
    SELECT
       p.folio
       ,p.invoicenumber
       ,p.invoicenumber as TopParent
       ,p.invoicenumber as Parent
       ,CAST(p.invoicenumber AS NVARCHAR(1000)) as Hierarchy
       ,0 as Level
    FROM
       cteGetTopParent p
    WHERE
       RowNum = 1

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,c.TopParent
       ,c.invoicenumber AS Parent
       ,CAST(c.TopParent + '|' + (CASE WHEN Level > 0 THEN c.invoicenumber + '|' ELSE '' END) + i.invoicenumber  AS NVARCHAR(1000)) as Hierarchy
       ,Level + 1 as Level
    FROM
       cteGetWholeTree c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.parentinvoice
       INNER JOIN dbo.invoice i
       ON r.invoice = i.invoicenumber
)

SELECT *
FROM
    cteGetWholeTree
like image 156
Matt Avatar answered Oct 26 '22 00:10

Matt


Your model is broken to begin with. parentinvoice should be in the invoice table. It's a recursive database model....so make the table schema recursive. Have a nullable foreign key in a column that references it's own table. Any time that field (the parent invoice field) is null indicates that it is the primary invoice. Any row having a parent is a piece of invoice.

When you want to find a value in a tree level structure you wrap your initial sql query into a 'SELECT(.....)' statement (creating your own custom selectable table) that filters out what you want. Let me know if you have any questions!

like image 35
SwampDev Avatar answered Oct 26 '22 02:10

SwampDev


I was a little unclear as to your actual requirements, so I figured a Table Valued Function may be appropriate here. I added a few optional items, and if unwanted, they are easy enough to remove (i.e. TITLE, Nesting, TopInvoice, TopFolio). Also, you may notice the Range Keys (R1/R2). These serve many functions: Presentation Sequence, Selection Criteria, Parent/Leaf Indicators, and perhaps most importantly Non-Recursive Aggregation.

To Return the Entire Hierarchy

Select * from [dbo].[udf_SomeFunction](NULL,NULL)     

enter image description here


To Return an Invoice and ALL of its descendants

Select * from [dbo].[udf_SomeFunction]('A1122',NULL) 

enter image description here


To Return the PATH of a Folio

Select * from [dbo].[udf_SomeFunction](NULL,'1003') 

enter image description here


To Return a Folio Limited to an Invoice

Select * from [dbo].[udf_SomeFunction]('A1122','1003')

enter image description here

The following code requires SQL 2012+

CREATE FUNCTION [dbo].[udf_SomeFunction](@Invoice nvarchar(25),@Folio nvarchar(25))
Returns Table
As  
Return (
with cteBld as (
      Select Seq  = cast(1000+Row_Number() over (Order By Invoice) as nvarchar(500)),I.Invoice,I.ParentInvoice,Lvl=1,Title = I.Invoice,F.Folio
        From (
              Select Distinct 
                     Invoice=ParentInvoice
                    ,ParentInvoice=cast(NULL as nvarchar(20)) 
              From   [Invoice_Relation] 
              Where  @Invoice is NULL and ParentInvoice Not In (Select Invoice from [Invoice_Relation])
              Union  All
              Select Invoice
                    ,ParentInvoice 
              From   [Invoice_Relation] 
              Where  Invoice=@Invoice
             ) I
        Join Invoice F on I.Invoice=F.InvoiceNumber
      Union  All
      Select Seq  = cast(concat(A.Seq,'.',1000+Row_Number() over (Order by I.Invoice)) as nvarchar(500))
            ,I.Invoice
            ,I.ParentInvoice
            ,A.Lvl+1
            ,I.Invoice,F.folio
      From   [Invoice_Relation] I
      Join   cteBld  A on I.ParentInvoice = A.Invoice 
      Join   Invoice F on I.Invoice=F.InvoiceNumber )
     ,cteR1  as (Select Seq,Invoice,Folio,R1=Row_Number() over (Order By Seq) From cteBld)
     ,cteR2  as (Select A.Seq,A.Invoice,R2=Max(B.R1) From cteR1 A Join cteR1 B on (B.Seq like A.Seq+'%') Group By A.Seq,A.Invoice )

Select Top 100 Percent 
       B.R1
      ,C.R2
      ,A.Invoice
      ,A.ParentInvoice
      ,A.Lvl
      ,Title = Replicate('|-----',A.Lvl-1)+A.Title    -- Optional: Added for Readability
      ,A.Folio
      ,TopInvoice  = First_Value(A.Invoice) over (Order By R1) 
      ,TopFolio    = First_Value(A.Folio)   over (Order By R1) 
 From  cteBld A 
 Join  cteR1  B on A.Invoice=B.Invoice 
 Join  cteR2  C on A.Invoice=C.Invoice 
 Where (@Folio is NULL)
    or (@Folio is Not NULL and (Select R1 from cteR1 Where Folio=@Folio) between R1 and R2)
 Order By R1
)

Final Thoughts:

This certainly may be more than what you were looking, and there is good chance that I COMPLETELY misunderstood your requirements. That said, being a TVF you can expand with additional WHERE and/or ORDER clauses or even incorporate into a CROSS APPLY.

like image 31
John Cappelletti Avatar answered Oct 26 '22 00:10

John Cappelletti


This uses an approach of using a hierarchyid, first generating hierarchyid for every row, then selecting the row where folio is 1003, then finding all ancestors who have an invoicenumber of 'A1122'. It's not very efficient but may give you some different ideas:

;WITH
Allfolios
AS
(
    Select i.folio, i.InvoiceNumber,
          hierarchyid::Parse('/' + 
               CAST(ROW_NUMBER() 
                        OVER (ORDER BY InvoiceNumber) AS VARCHAR(30)
                   ) + '/') AS hierarchy, 1 as level
    from invoice i
    WHERE NOT EXISTS 
        (SELECT * FROM invoice_relation ir WHERE ir.invoice = i. invoicenumber)
    UNION ALL
    SELECT i.folio, i.invoiceNumber,
          hierarchyid::Parse(CAST(a.hierarchy as VARCHAR(30)) + 
                             CAST(ROW_NUMBER() 
                                   OVER (ORDER BY a.InvoiceNumber) 
                               AS VARCHAR(30)) + '/') AS hierarchy, 
          level + 1
    FROM Allfolios A
    INNER JOIN invoice_relation ir
        on a.InvoiceNumber = ir.ParentInvoice
    INNER JOIN invoice i
        on ir.Invoice = i.invoicenumber
),
Ancestors
AS
(
    SELECT folio, invoiceNumber, hierarchy, hierarchy.GetAncestor(1) as AncestorId
    from Allfolios
    WHERE folio = 1003
    UNION ALL
    SELECT af.folio, af.invoiceNumber, af.hierarchy, af.hierarchy.GetAncestor(1)
      FROM Allfolios AF
      INNER JOIN 
            Ancestors a ON Af.hierarchy= a.AncestorId
)
SELECT *
FROM Ancestors
WHERE InvoiceNumber = 'A1122'

Edited for the case highlighted by @jj32 where you wish to find the root element in the hierarchy which folio 1003 is in, then find any descendant of that root which has an invoice number of 'A1122'. See below:

;WITH
Allfolios -- Convert all rows to a hierarchy
AS
(
    Select i.folio, i.InvoiceNumber,
          hierarchyid::Parse('/' + 
               CAST(ROW_NUMBER() 
                        OVER (ORDER BY InvoiceNumber) AS VARCHAR(30)
                   ) + '/') AS hierarchy, 1 as level
    from invoice i
    WHERE NOT EXISTS 
        (SELECT * FROM invoice_relation ir WHERE ir.invoice = i. invoicenumber)
    UNION ALL
    SELECT i.folio, i.invoiceNumber,
          hierarchyid::Parse(CAST(a.hierarchy as VARCHAR(30)) + 
                             CAST(ROW_NUMBER() 
                                   OVER (ORDER BY a.InvoiceNumber) 
                               AS VARCHAR(30)) + '/') AS hierarchy, 
          level + 1
    FROM Allfolios A
    INNER JOIN invoice_relation ir
        on a.InvoiceNumber = ir.ParentInvoice
    INNER JOIN invoice i
        on ir.Invoice = i.invoicenumber
),
Root -- Find Root
AS
(
    SELECT *
    FROM AllFolios AF
    WHERE Level = 1 AND 
    (SELECT hierarchy.IsDescendantOf(AF.hierarchy)  from AllFolios AF2 WHERE folio = 1003) = 1
)
-- Find all descendants of the root element which have an invoicenumber = 'A1122'
SELECT *
FROM ALLFolios
WHERE hierarchy.IsDescendantOf((SELECT TOP 1 hierarchy FROM Root)) = 1 AND
invoicenumber = 'A1122'
like image 1
Steve Ford Avatar answered Oct 26 '22 00:10

Steve Ford