Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inheritance in Database Design

I am designing a new laboratory database with MANY types of my main entities.

The table for each entity will hold fields common to ALL types of that entity (entity_id, created_on, created_by, etc). I will then use concrete inheritance (separate table for each unique set of attributes) to store all remaining fields.

I believe that this is the best design for the standard types of data which come through the laboratory daily. However, we often have a special samples which often are accompanied by specific values the originator wants stored.

Question: How should I model special (non-standard) types of entities?

Option 1: Use entity-value for special fields
One table (entity_id, attribute_name, numerical_value) would hold all data for any special entity.
+ Fewer tables.
- Cannot enforce requiring a particular attribute.
- Must convert (pivot) rows to columns which is inefficient.

Option 2: Strict concrete inheritance.
Create separate table for each separate special case.
+ Follows in accordance with all other rules
- Overhead of many tables with only a few rows.

Option 3: Concrete inheritance with special tables under a different user.
Put all special tables under a different user.
+ Keeps all special and standard tables separate.
+ Easier to search for common standard table in a list without searching through all special tables.
- Overhead of many tables with only a few rows.

like image 423
Steven Avatar asked Aug 14 '09 19:08

Steven


1 Answers

Actually the design you described (common table plus subtype-specific tables) is called Class Table Inheritance.

Concrete Table Inheritance would have all the common attributes duplicated in the subtype tables, and you'd have no supertype table as you do now.

I'm strongly against EAV. I consider it an SQL antipattern. It may seem like an elegant solution because it requires fewer tables, but you're setting yourself up for a lot of headache later. You identified a couple of the disadvantages, but there are many others. IMHO, EAV is used appropriately only if you absolutely must not create a new table when you introduce a new subtype, or if you have an unbounded number of subtypes (e.g. users can define new attributes ad hoc).

You have many subtypes, but still a finite number of them, so if I were doing this project I'd stick with Class Table Inheritance. You may have few rows of each subtype, but at least you have some assurance that all rows in each subtype have the same columns, you can use NOT NULL if you need to, you can use SQL data types, you can use referential integrity constraints, etc. From a relational perspective, it's a better design than EAV.

One more option that you didn't mention is called Serialized LOB. That is, add a BLOB column for a semi-structured collection of custom attributes. Store XML, YAML, JSON, or your own DSL in that column. You won't be able to parse individual attributes out of that BLOB easily with SQL, you'll have to fetch the whole BLOB back into your application and extract individual attributes in code. So in some ways it's less convenient. But if that satisfies your usage of the data, then there's nothing wrong with that.

like image 121
Bill Karwin Avatar answered Oct 14 '22 14:10

Bill Karwin