Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generic Database table design

Just trying to figure out the best way to design my table for the following scenario:

I have several areas in my system (documents, projects, groups and clients) and each of these can have comments logged against them.

My question is should I have one table like this:

CommentID
DocumentID
ProjectID
GroupID
ClientID
etc

Where only one of the ids will have data and the rest will be NULL or should I have a separate CommentType table and have my comments table like this:

CommentID
CommentTypeID
ResourceID (this being the id of the project/doc/client)
etc

My thoughts are that option 2 would be more efficient from an indexing point of view. Is this correct?

like image 478
Gaz Avatar asked Jun 16 '10 16:06

Gaz


People also ask

What is a generic database?

A generic database is a table in which the field names are stored as field values in one column, while the field values are stored in a second. Generic databases are usually used for attributes of different objects.

What is database design table?

The design process Gather all of the types of information you might want to record in the database, such as product name and order number. Divide the information into tables. Divide your information items into major entities or subjects, such as Products or Orders. Each subject then becomes a table.

What is database design with example?

Database design is the organization of data according to a database model. The designer determines what data must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database model. Database management system manages the data accordingly.

What is database design diagram?

Database Design Articles A database diagram is the very foundation of a database design and development effort. It represents the basic structure of a database; how information is stored, categorized and managed within it.


2 Answers

Option 2 is not a good solution for a relational database. It's called polymorphic associations (as mentioned by @Daniel Vassallo) and it breaks the fundamental definition of a relation.

For example, suppose you have a ResourceId of 1234 on two different rows. Do these represent the same resource? It depends on whether the CommentTypeId is the same on these two rows. This violates the concept of a type in a relation. See SQL and Relational Theory by C. J. Date for more details.

Another clue that it's a broken design is that you can't declare a foreign key constraint for ResourceId, because it could point to any of several tables. If you try to enforce referential integrity using triggers or something, you find yourself rewriting the trigger every time you add a new type of commentable resource.

I would solve this with the solution that @mdma briefly mentions (but then ignores):

CREATE TABLE Commentable (
  ResourceId INT NOT NULL IDENTITY,
  ResourceType INT NOT NULL,
  PRIMARY KEY (ResourceId, ResourceType)
);

CREATE TABLE Documents (
  ResourceId INT NOT NULL,
  ResourceType INT NOT NULL CHECK (ResourceType = 1),
  FOREIGN KEY (ResourceId, ResourceType) REFERENCES Commentable
);

CREATE TABLE Projects (
  ResourceId INT NOT NULL,
  ResourceType INT NOT NULL CHECK (ResourceType = 2),
  FOREIGN KEY (ResourceId, ResourceType) REFERENCES Commentable
);

Now each resource type has its own table, but the serial primary key is allocated uniquely by Commentable. A given primary key value can be used only by one resource type.

CREATE TABLE Comments (
  CommentId INT IDENTITY PRIMARY KEY,
  ResourceId INT NOT NULL,
  ResourceType INT NOT NULL,
  FOREIGN KEY (ResourceId, ResourceType) REFERENCES Commentable
);

Now Comments reference Commentable resources, with referential integrity enforced. A given comment can reference only one resource type. There's no possibility of anomalies or conflicting resource ids.

I cover more about polymorphic associations in my presentation Practical Object-Oriented Models in SQL and my book SQL Antipatterns.

like image 55
Bill Karwin Avatar answered Sep 23 '22 20:09

Bill Karwin


Read up on database normalization.

Nulls in the way you describe would be a big indication that the database isn't designed properly.

You need to split up all your tables so that the data held in them is fully normalized, this will save you a lot of time further down the line guaranteed, and it's a lot better practice to get into the habit of.

like image 22
Tom Gullen Avatar answered Sep 20 '22 20:09

Tom Gullen