Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Anyone used SQl Server 2008 HierarchialID type to store genealogy data

I have a genealogical database (about sheep actually), that is used by breeders to research genetic information. In each record I store fatherid and motherid. In a seperate table I store complete 'roll up' information so that I can quickly tell the complete family tree of any animal without recursing thru the entire database...

Recently discovered the hierarchicalID type built into SQL server 2008, on the surface it sounds promising, but I and am wondering if anyone has used it enough to know whether or not it would be appropriate in my type of app(i.e. two parents, multiple kids)? All the samples I have found/read so far deal with manager/employee type relationships where a given boss can have multiple employees, and each employee can have a single boss.

The needs of my app are similar, but not quite the same.

I am sure I will dig into this new technology anyway, but it would be nice to shortcut my research if someone already knew that it was not designed in such a fashion that it would allow me to make use of it.

I am also curious what kind of performance people are seeing using this new data type versus other methods that do the same thing.

like image 874
E.J. Brennan Avatar asked Feb 23 '09 14:02

E.J. Brennan


3 Answers

Assuming each sheep has one male parent and one female parent, and that no sheep can be its own parent (leading to an Ovine Temporal Paradox), then what about using two HierarchyIDs?

CREATE TABLE dbo.Sheep(
    MotherHID hierarchyid NOT NULL,
    FatherHID hierarchyid NOT NULL,
    Name int NOT NULL
)
GO
ALTER TABLE dbo.Sheep 
ADD CONSTRAINT PK_Sheep PRIMARY KEY CLUSTERED (
    MotherHID,
    FatherHID
)
GO

By making them a joint PK, you'd be uniquely identifying each sheep as the product of its maternal hierarchy and it's paternal hierarchy.

There may be some inherent problem lurking here, so proceed cautiously with a couple simple prototypes - but initially it seems like it would work for you.

like image 52
msulis Avatar answered Nov 08 '22 01:11

msulis


I can't see how it would work; in a regular hierarchy, there is a single chain to the root, so it can store the path (which is what the binary is) to each node. However, with multiple parents, this isn't possible: even if you split matriarchy and partiarchy, you still have 1 mother, 2 grandmothers, 4 great-grand-mothers, etc (not even getting into some of the more "interesting" scanerios possible, especially with livestock). There is no single logical path to encode, so no: I don't think that this can work in your case.

I'm happy to be corrected, though.

like image 24
Marc Gravell Avatar answered Nov 08 '22 01:11

Marc Gravell


Using two separate HierarchyID to indicate father and mother would work well.

However, you definitely would NOT want to use those as a unique indicator of the row, since it's a 2-to-many situation. (Two sheep can have multiple children.)

I don't see anything inherently wrong with using HierarchyId for ancestry--for Sheep at least. For people, the relationships are much more complicated than "this person begat that person", so obviously the use would be limited to breeding.

like image 2
Richard Avatar answered Nov 08 '22 01:11

Richard