I have a table that defines a hierarchy:
Create Table [example] (
id Integer Not Null Primary Key,
parentID Integer Null,
largeData1 nVarChar(max) Null,
largeData2 nVarChar(max) Null);
-- largeData3...n also exist
Insert Into [example] (id, parentID, largeData1, largeData2)
Select 1, null, 'blah blah blah', null Union
Select 2, 1, null, null Union
Select 3, 1, 'foo bar foobar', null Union
Select 4, 3, null, 'lorem ipsum' Union
Select 5, 4, null, null;
Hierarchy diagram for this data:
I want to write a query that will return a single row for any given [id] value. The row should contain that row's [id] and [parentID] information. It should also contain the [largeData1...n] fields. However, if a largeData field is null, then it should traverse up the hierarchy until a non-null value for that field is encountered. It should, in short, function like the coalesce function, except across a hierarchy of rows instead of a set of columns.
Example:
Where [id] = 1:
id: 1
parentID: null
largeData1: blah blah blah
largeData2: null
Where [id] = 2
id: 1
parentID: 1
largeData1: blah blah blah
largeData2: null
Where [id] = 3
id: 3
parentID: 1
largeData1: foo bar foobar
largeData2: null
Where [id] = 4
id: 4
parentID: 3
largeData1: foo bar foobar
largeData2: lorem ipsum
Where [id] = 5
id: 5
parentID: 4
largeData1: foo bar foobar
largeData2: lorem ipsum
So far, I have this:
Declare @id Integer; Set @id = 5;
With heirarchy
(id, parentID, largeData1, largeData2, [level])
As (
Select id, parentID, largeData1,
largeData2, 1 As [level]
From example
Where id = @id
Union All
Select parent.id, parent.parentID,
parent.largeData1,
parent.largeData2,
child.[level] + 1 As [level]
From example As parent
Inner Join heirarchy As child
On parent.id = child.parentID)
Select id, parentID,
(Select top 1 largeData1
From heirarchy
Where largeData1 Is Not Null
Order By [level] Asc) As largeData1,
(Select top 1 largeData2
From heirarchy
Where largeData2 Is Not Null
Order By [level] Asc) As largeData2
From example
Where [id] = @id;
This returns the results that I am looking for. However, according to the query plan, it is making a separate pass through the hierarchy for each largeData field that I pull back.
How can I make this more efficient?
This is obviously a simplified version of a more complex problem. The final query will return data in XML format, so any solutions involving the FOR XML clause are perfectly fine.
I can create a CLR aggregate function for this, if doing so would help. I have not yet explored that route.
I came up with this:
DECLARE @Id int
SET @Id = 5
;WITH cte (Id, ParentId, SaveParentId, LargeData1, LargeData2)
as (-- The "anchor", your target Id
select
ex.Id
,ex.ParentId
,ex.ParentId SaveParentId -- Not changed throughout the CTE
,ex.LargeData1
,ex.LargeData2
from Example ex
where ex.Id = @Id
union all select
cte.Id
,ex.ParentId
,cte.SaveParentId -- Not changed throughout the CTE
-- These next are only "reset" if they are null and a not-null
-- value was found at this level
,isnull(ex.LargeData1, cte.LargeData2)
,isnull(ex.LargeData2, cte.LargeData2)
from Example ex
inner join cte
on cte.ParentId = ex.Id)
select
Id
,SaveParentId ParentId
,max(LargeData1) LargeData1
,max(LargeData2) LargeData2
from cte
group by Id, SaveParentId
Basically, start at your target node and walk up the tree, replacing your null columns with not-null values if and when they are found.
(Sorry, but I don't do XML on weekends.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With