Consider the following rows in a database:
Id | Parent
__________________
1 null
2 1
3 2
4 3
5 null
6 5
Each Id
that has a null
Parent
is the "Owner"/"Super Parent".
What would be the best approach, performance wise, to collect the parents and their children ? Should I use LINQ or Stored Procedures ?
I want the end result to be IEnumerable<IEnumerable<int>>
.
In SQL, you could query using a CTE. For example, to retrieve a list of nodes with their parent and the highest parent in their tree:
declare @t table (id int, parent int)
insert @t (id, parent) values (1, null), (2,1), (3,2), (4,3), (5,null), (6,5)
; with cte as (
select id, parent, id as head
from @t
where parent is null
union all
select child.id, child.parent, parent.head
from @t child
join cte parent
on parent.id = child.parent
)
select *
from cte
This gives:
id parent head
1 NULL 1
2 1 1
3 2 1
4 3 1
5 NULL 5
6 5 5
Note that I changed your example data so row 2 is no longer a child of itself, but a child of row 1.
You may also use a pure SQL solution; here is a sample for SQL Server. It wouldn't be difficult to re-write it for a different DB manager:
/* Create table */
CREATE TABLE dbo.Nodes (ID int NOT NULL PRIMARY KEY, Parent int)
/* Insert sample data */
INSERT INTO Nodes VALUES (1,NULL)
INSERT INTO Nodes VALUES (2,1)
INSERT INTO Nodes VALUES (3,2)
INSERT INTO Nodes VALUES (4,3)
INSERT INTO Nodes VALUES (5,NULL)
INSERT INTO Nodes VALUES (6,5)
/* Create recursive function */
CREATE function dbo.fn_Root(@ID int) returns int
AS BEGIN
DECLARE @R int
SELECT @R = CASE WHEN Parent IS NULL THEN ID
ELSE dbo.fn_Root(Parent)
END
FROM Nodes
WHERE id = @id
RETURN @R
END
/* Query the table */
SELECT ID, Parent, dbo.fn_Root(ID) AS Root
FROM Nodes
/* Also, in SQL Server you can create a calculated column */
ALTER TABLE Nodes ADD Root AS dbo.fn_Root(id)
This is the basic version. But this one would fail if your data had closed loops (not a tree structure). To prevent code from entering in an endless loop, the function could be improved like this:
CREATE function dbo.fn_Root(@ID int, @Initial int) returns int
AS BEGIN
DECLARE @R int
DECLARE @Parent int
SELECT @Parent = Parent FROM Nodes WHERE ID = @ID
IF @Parent IS NULL
SELECT @R = @ID /* No parent, the root is the node itself */
ELSE
IF @Parent = @Initial
/* We have returned to initial node: endless loop. We return NULL to indicate no root exists */
SELECT @R = NULL
ELSE
/* The root will be the root of the parent node */
SELECT @R = dbo.fn_Root(@Parent,@Initial)
RETURN @R
END
/* Query the table */
SELECT ID, Parent, dbo.fn_Root(ID,ID) AS Root FROM Nodes
With this modification, if function returns NULL, it indicates that the node is part of a loop, and so it has no root node.
If the table is not too big your best bet would be to return the entire table by doing this db.Categories
Once the entire categories table is fetched into Entity Framework, EF will use relationship span to fix the objectgraph so when you do category.SubCategories you would get all the children. The advantage of this is, you sql wouldn't be complex because it would be basically select * from categories. Much of the hard work would be done by EF in fixing the object graph so that all the children align correctly with their parent.
You can also use what someone else mentioned about using common table expression.
I cover two such concepts in my book.
5-11 Using Relationship Span 5-2 Loading a Complete Object Graph(CTE)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With