What are the benefits and liabilities of including grandparent+ foreign keys in a table?
For example, suppose my object model looks like the one below. (Greatly simplified, so it's not eligible for a hierarchical recursive table.)
a {aId, bCollection, ...}
b {bId, cCollection, ...}
c {cId, dCollection, ...}
d {dId}
The two data model options that come to mind are:
option 1:
a {pkA, ...}
b {pkB, fkA, ...}
c {pkC, fkB, ...}
d {pkD, fkC, ...}
option 2:
a {pkA, ...}
b {pkB, fkA, ...}
c {pkC, fkB, fkA, ...}
d {pkD, fkC, fkB, fkA, ...}
Option 1 is more normalized, and inserts and updates will be easier, but I can see queries getting quite complicated, especially with many-to-many relationships and/or compound keys.
Option 2 complicates the inserts and updates but extracting reports will be easier. Furthermore the database will be larger, but I'm not really concerned with that as it's pretty small anyway.
But those are fairly insignificant concerns compared to the issues that one would have with an ORM like Entity Framework. I'm leaning toward option 2 because I'd like to access the grandchildren directly from the parent, like so:
Class A { id, bCollection, cCollection, dCollection, ... }
Class B { id, cCollection, dCollection, ... }
Class C { id, dCollection, ... }
Class D { id, ...}
Does Entity Framework 4.0 handle this situation gracefully? What are the pros and cons of the two options? Is there another alternative I should consider?
Or more simply, how the heck does one google this kinda question?!?
One other note: Like many of you, my gut and head lean heavily toward option A, but I know I've read an MSDN article that goes into great detail about why option B is better. Unfortunately, I can't find it. :(
Thanks in advance for your thoughts.
The table with the foreign key is called the child table, and the table with the primary key is called the referenced or parent table.
Foreign key references are stored within a child table and link to a primary key in a separate table. The column acting as a foreign key must have a corresponding value in its linked table. This creates referential integrity.
Parent and child tables must use the same storage engine, and they cannot be defined as temporary tables. Creating a foreign key constraint requires the REFERENCES privilege on the parent table. Corresponding columns in the foreign key and the referenced key must have similar data types.
Note that foreign keys are not mandatory, and a table may have no foreign keys. Conversely, every column in a table may have a foreign key constraint.
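As a minimal sketch of the referential integrity described above, the snippet below uses Python's built-in sqlite3 module rather than the engine the paragraph has in mind. The table names `a` and `b` follow the question; the column names and sample ids are assumptions made for illustration, and SQLite needs foreign key enforcement switched on explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table a, child table b whose a_id must match an existing a.id.
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE b ("
    " id INTEGER PRIMARY KEY,"
    " a_id INTEGER NOT NULL REFERENCES a(id))"
)

conn.execute("INSERT INTO a (id) VALUES (1)")
conn.execute("INSERT INTO b (id, a_id) VALUES (10, 1)")  # parent exists

try:
    # No row in a has id 999, so the constraint should reject this row.
    conn.execute("INSERT INTO b (id, a_id) VALUES (11, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

b_rows = conn.execute("SELECT id, a_id FROM b").fetchall()
```

Only the row with a valid parent ends up in `b`; the orphan insert fails with an integrity error instead of silently storing a dangling reference.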
I would avoid option B like the plague. Neither option is substantially more complex to query than the other, but the second option is much more difficult to maintain and is crying out for normalization into option A.
For querying option A, all you're talking about is adding simple joins. Since this is not a recursive relationship, you must already know up front how many levels "deep" the query can go, so you don't have to worry about it being brittle or only working for a subset of potential cases.
Compare selecting your deepest case where you're looking for the top-level parent of a deeply-nested node:
Option A:
select
    a.id
from d
    join c on c.id = d.c_id
    join b on b.id = c.b_id
    join a on a.id = b.a_id
where d.id = @id
Option B:
select a_id from d where id = @id
Is Option A more complicated? Yes, but it shouldn't pose a challenge for anyone to figure out what's going on.
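To make the comparison concrete, here is a rough sketch that runs both queries against an in-memory SQLite database through Python's sqlite3 module. The table names `d_opt_a` and `d_opt_b`, the helper function names, and the sample ids are hypothetical and exist only so both layouts can sit side by side in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id));
CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER REFERENCES b(id));
-- Option A: d stores only its direct parent.
CREATE TABLE d_opt_a (id INTEGER PRIMARY KEY, c_id INTEGER);
-- Option B: d also carries every ancestor key.
CREATE TABLE d_opt_b (id INTEGER PRIMARY KEY,
                      c_id INTEGER, b_id INTEGER, a_id INTEGER);
INSERT INTO a VALUES (1);
INSERT INTO b VALUES (10, 1);
INSERT INTO c VALUES (100, 10);
INSERT INTO d_opt_a VALUES (1000, 100);
INSERT INTO d_opt_b VALUES (1000, 100, 10, 1);
""")


def top_parent_option_a(d_id):
    """Walk d -> c -> b -> a with three joins."""
    row = conn.execute(
        """
        SELECT a.id
        FROM d_opt_a AS d
        JOIN c ON c.id = d.c_id
        JOIN b ON b.id = c.b_id
        JOIN a ON a.id = b.a_id
        WHERE d.id = ?
        """,
        (d_id,),
    ).fetchone()
    return row[0] if row else None


def top_parent_option_b(d_id):
    """Read the denormalized a_id straight off the d row."""
    row = conn.execute(
        "SELECT a_id FROM d_opt_b WHERE id = ?", (d_id,)
    ).fetchone()
    return row[0] if row else None
```

Both helpers return the same top-level id for the same `d` row; the difference is only in how much of the chain the query has to spell out.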
As for compatibility with EF4, this is not a problem. You can navigate up the chain of parents in a linear fashion to get the grandparent you want. You can do this either in code or in a query; either one will work fine:
In code:
var entity = context.D.Where(d => d.Id == 4).First();
var grandParent = entity.C.B.A;
In a query (using LINQ and joins):
var grandparent = (from d in context.D
                   join c in context.C on d.CId equals c.Id
                   join b in context.B on c.BId equals b.Id
                   join a in context.A on b.AId equals a.Id
                   where d.Id == id
                   select a).First(); // or: select a.Id
In a query (using navigation properties):
var grandparent = (from d in context.D
                   where d.Id == id
                   select d.C.B.A).First(); // or: select d.C.B.A.Id
If your database is primarily OLTP, I would go with option A. If your reports become an issue, you can create a view or denormalized table to represent option B.
If your database is primarily OLAP, I would go with option B.
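The view suggested for the OLTP case could look roughly like the sketch below, again using SQLite through Python's sqlite3 module. The view name `d_flat`, its column names, and the sample rows are assumptions chosen for illustration; the base tables stay in the normalized option A shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (option A) base tables.
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id));
CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER REFERENCES b(id));
CREATE TABLE d (id INTEGER PRIMARY KEY, c_id INTEGER REFERENCES c(id));

-- Read-only projection that exposes every ancestor key per d row,
-- i.e. the option B shape, without storing it twice.
CREATE VIEW d_flat AS
SELECT d.id AS d_id, c.id AS c_id, b.id AS b_id, a.id AS a_id
FROM d
JOIN c ON c.id = d.c_id
JOIN b ON b.id = c.b_id
JOIN a ON a.id = b.a_id;

INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (10, 1), (20, 2);
INSERT INTO c VALUES (100, 10), (200, 20);
INSERT INTO d VALUES (1000, 100), (1001, 100), (2000, 200);
""")

# Report query: all d rows under a given top-level parent, no joins needed.
d_under_a1 = [
    row[0]
    for row in conn.execute(
        "SELECT d_id FROM d_flat WHERE a_id = ? ORDER BY d_id", (1,)
    )
]

flat_rows = conn.execute(
    "SELECT d_id, c_id, b_id, a_id FROM d_flat ORDER BY d_id"
).fetchall()
```

Reports query `d_flat` as if the ancestor keys were columns on `d`, while inserts and updates keep touching only the normalized tables.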
I'd say that if the DB has good support for recursive relationships (e.g., Oracle is great at this, and several others have specific syntax for these queries), then use recursive (your option A). Otherwise you are going to be stuck for several days debugging your code, making sure you update every record that needs updating when a relationship changes and you have to cascade the changes by hand.
And for Google, use the term "recursive tables", or "recursive queries".
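The maintenance cost of cascading changes by hand can be illustrated with a small sketch of the denormalized layout, using SQLite through Python's sqlite3 module. The schema, ids, and variable names are hypothetical; the point is only that re-parenting one `c` row leaves its `d` rows pointing at the old ancestors until a second update fixes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized layout: c and d repeat their ancestors' keys.
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER, a_id INTEGER);
CREATE TABLE d (id INTEGER PRIMARY KEY,
                c_id INTEGER, b_id INTEGER, a_id INTEGER);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (10, 1), (20, 2);
INSERT INTO c VALUES (100, 10, 1);
INSERT INTO d VALUES (1000, 100, 10, 1);
""")

# Move c=100 from b=10 (under a=1) to b=20 (under a=2),
# updating only the c row itself.
conn.execute("UPDATE c SET b_id = 20, a_id = 2 WHERE id = 100")

# The d row still carries the old ancestor keys.
stale = conn.execute(
    "SELECT b_id, a_id FROM d WHERE id = 1000"
).fetchone()

# Manual cascade: copy the new ancestor keys down to every affected d row.
conn.execute("""
UPDATE d
SET b_id = (SELECT b_id FROM c WHERE c.id = d.c_id),
    a_id = (SELECT a_id FROM c WHERE c.id = d.c_id)
WHERE c_id = 100
""")

fixed = conn.execute(
    "SELECT b_id, a_id FROM d WHERE id = 1000"
).fetchone()
```

With the normalized layout the first update would be the whole change, because `d` never stores `b_id` or `a_id` in the first place.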