Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using IDs from multiple tables in a single column

One of my co-workers created a schema similar to the following. This is a simplified schema including only the parts necessary to address this question.

The system rules are as follows:

  1. Departments can have 0 to many divisions.
  2. A division must belong to only one department.
  3. An article can be assigned to either a department, or a division of that department.

The schema is:

Department
---------- 
DepartmentID (PK) int NOT NULL
DepartmentName varchar(50) NOT NULL

Division
--------
DivisionID (PK) int NOT NULL
DepartmentID (FK) int NOT NULL
DivisonName varchar(50) NOT NULL

Article
-------
ArticleID (PK) int NOT NULL
UniqueID int NOT NULL
ArticleName varchar(50) NOT NULL

He defined the schema using an imaginary rule (for lack of a better term), that all DepartmentIDs would be between 1 and 100, and all DivisionIDs would be between 101 and 200. He states that when querying the Article table, you will know whether the UniqueID is from the Department table or the Division table based on what range it falls into.

I think this is a poor design and proposed the following alternative schema:

Department
----------
DepartmentID (PK) int NOT NULL
ParentDepartmentID (FK) int NULL /* Self-referencing foreign key.  Divisions have parent departments. */
DepartmentName varchar(50) NOT NULL

Article
-------
ArticleID (PK) int NOT NULL
DepartmentID (FK) int NOT NULL
ArticleName varchar(50) NOT NULL

I believe this is a properly normalized schema and properly enforces relationships and data integrity, while honoring the business rules outlined above.

My specific question is this:

I know using one column to contain values from two domains is poor design, and I can argue the benefits of the foreign key in the Article table. However, can someone provide a reference to a specific database design article / paper that I can use to backup my position. If I can point to something concrete, it will make it that much easier.

like image 699
DCNYAM Avatar asked Aug 03 '11 15:08

DCNYAM


People also ask

Can one column reference multiple tables?

In SQL can a single column in a table reference multiple tables - no this is not possible. A foreign key always references one target table (and one table only).

What are the three ways to work with multiple tables in the same query?

Three Main Ways to Combine ResultsJOIN – You can use joins to combine columns from one or more queries into one result. UNION – Use Unions and other set operators to combine rows from one or more queries into one result. Sub Queries – I sometimes call these nested queries.


1 Answers

Your coworker has implemented a design called Polymorphic Associations. That is, the "foreign key" refers to one of two different parent tables. Most people add another column parent_type or something like that so that you can tell which parent table a given row references. In your coworker's case, he has instead subdivided the range of id's. This is a fragile design, because you can't enforce it at the database level. If you ever insert a department number > 100, you can't know if your articles apply to a department or a division.

Whereas you have developed a design that looks like Single Table Inheritance, where you store multiple related types in a single table, so the primary keys are assured to remain unique, and the Articles can reference any instance of any of the related types.

Here's another alternative:

Think of object-oriented design. If you wanted to allow two different classes to have articles, you might make a common superclass or a common interface for the two classes. You can do the same thing in SQL:

ArticleProducer
---------------
ProducerID (PK) int NOT NULL

Department
----------
DepartmentID (PK) int NOT NULL, (FK)->ArticleProducer
DepartmentName varchar(50) NOT NULL

Division
--------
DivisionID (PK) int NOT NULL, (FK)->ArticleProducer
DepartmentID (FK) int NOT NULL
DivisonName varchar(50) NOT NULL

Article
-------
ArticleID (PK) int NOT NULL, (FK)->ArticleProducer
UniqueID int NOT NULL
ArticleName varchar(50) NOT NULL

So an article must be produced by a single ArticleProducer. Each Department or Division is an ArticleProducer.

See also Why can you not have a foreign key in a polymorphic association?

For more about Polymorphic Associations, see my presentation Practical Object-Oriented Models in SQL, or my book, SQL Antipatterns: Avoiding the Pitfalls of Database Programming.


Re comments from Erwin Smout:

You're right, trying to enforce that no more than one row from all the subtype tables is a bit tricky. MySQL does not support CHECK constraints in any storage engine, unfortunately. You can achieve something similar with lookup tables:

CREATE TABLE ArticleProducerTypes (ProducerType TINYINT UNSIGNED PRIMARY KEY);
INSERT INTO ArticleProducerTypes VALUES (1), (2);

CREATE TABLE ArticleProducer (
  ProducerID INT UNSIGNED NOT NULL PRIMARY KEY,
  ProducerType TINYINT UNSIGNED NOT NULL,
  UNIQUE KEY (ProducerID,ProducerType),
  FOREIGN KEY (ProducerType)
    REFERENCES ArticleProducerTypes(ProducerType)
) ENGINE=InnoDB;

CREATE TABLE DepartmentProducerType (ProducerType TINYINT UNSIGNED PRIMARY KEY);
INSERT INTO DepartmentProducerType VALUES (1);

CREATE TABLE Department (
  DepartmentID INT UNSIGNED NOT NULL PRIMARY KEY,
  DepartmentName VARCHAR(50) NOT NULL,
  ProducerType TINYINT UNSIGNED NOT NULL,
  FOREIGN KEY (DepartmentID, ProducerType) 
    REFERENCES ArticleProducer(ProducerID, ProducerType),
  FOREIGN KEY (ProducerType)
    REFERENCES DepartmentProducerType(ProducerType) -- restricted to '1'
) ENGINE=InnODB;

CREATE TABLE DivisionProducerType (ProducerType TINYINT UNSIGNED PRIMARY KEY);
INSERT INTO DivisionProducerType VALUES (2);

CREATE TABLE Division (
  DivisionID INT UNSIGNED NOT NULL PRIMARY KEY,
  ProducerType TINYINT UNSIGNED NOT NULL,
  DepartmentID INT UNSIGNED NOT NULL,
  FOREIGN KEY (DivisionID, ProducerType) 
    REFERENCES ArticleProducer(ProducerID, ProducerType),
  FOREIGN KEY (ProducerType)
    REFERENCES DivisionProducerType(ProducerType), -- restricted to '2'
  FOREIGN KEY (DepartmentID)
    REFERENCES Department(DepartmentID)  
) ENGINE=InnODB;

CREATE TABLE Article (
  ArticleID INT UNSIGNED NOT NULL PRIMARY KEY,
  ArticleName VARCHAR(50) NOT NULL,
  FOREIGN KEY (ArticleID)
    REFERENCES ArticleProducer(ProducerID)
);

Now each given row in ArticleProducer can be referenced by either Department or Division, but not both.

If we want to add a new producer type, we add one row to the ArticleProducerTypes lookup table, and create a pair of new tables for the new type. For example:

INSERT INTO ArticleProducerTypes VALUES (3);

CREATE TABLE PartnerProducerType (ProducerType TINYINT UNSIGNED PRIMARY KEY);
INSERT INTO PartnerProducerType VALUES (3);

CREATE TABLE Partner (
  PartnerID INT UNSIGNED NOT NULL PRIMARY KEY,
  ProducerType TINYINT UNSIGNED NOT NULL,
  FOREIGN KEY (PartnerID, ProducerType) 
    REFERENCES ArticleProducer(ProducerID, ProducerType),
  FOREIGN KEY (ProducerType)
    REFERENCES PartnerProducerType(ProducerType) -- restricted to '3'
) ENGINE=InnODB;

But we still have the possibility that neither contains a reference to that given row in ArticleProducer; i.e. we can't make a constraint that forces a row to be created in one of the dependent tables. I don't have a solution for that.

like image 133
Bill Karwin Avatar answered Sep 22 '22 13:09

Bill Karwin