I've run across some tables that are using a number of lookup tables in order to reference their specific values. This makes the database design extremely complicated for structures that could have only been 5 tables expanding into 20+ tables because of lookups. I've looked around but have seen no discussion on if using a lookup table is a good practice.
On one had, lookup tables let you manage your domains, allows adding new values to the domain, and shows devs/dbas that there is a domain to be followed. But at the same time, lookup tables clutter your database, add unnecessary business logic into the core data tables, and make obtaining information from the tables more complex. It also requires domain lookups to be executed through a database fetch which adds additional runtime to the application using it.
Without lookup tables, the data is easily presentable, creates less clutter in the database, and makes the database design easier to see. However, it makes anyone looking at the data unable to determine if the values in a specific column are tied to a domain, and, if there is a domain, makes managing the domain done through the application.
My question is: With modern standards, are lookup tables a good practice to be implementing?
If you had a table like:
CREATE TABLE Bugs (
bug_id INT PRIMARY KEY,
bug_status VARCHAR(20) DEFAULT 'NEW',
description TEXT
...
);
And you wanted to know what are all the valid values of bug_status
, you could get the set of statuses currently in use:
SELECT DISTINCT bug_status FROM Bugs;
But how do you know there's a sample of every status supported by the domain present in the table at the time you query it?
If you use this method to get a list of bug statuses allowed, for example to populate a drop-down list for your user interface, then you could never include a status that wasn't already in use by at least one bug. And that means the UI would not allow anyone to change the bug's status to some value for the first time.
This should be an example of why lookup tables are still a legitimate entry in your database design. It has nothing to do with "modern standards." I don't know why that would change anything.
Note that if you don't like doin JOINs to lookup tables to get the string value, you don't have to. If you prefer to put the string value directly in the main table, that's totally allowed by SQL.
CREATE TABLE Bugs (
bug_id INT PRIMARY KEY,
bug_status VARCHAR(20) DEFAULT 'NEW',
description TEXT
...
FOREIGN KEY (bug_status) REFERENCES BugStatusTypes (bug_status)
);
CREATE TABLE BugStatusTypes (
bug_status VARCHAR(20) PRIMARY KEY,
description TEXT,
is_active BOOL
);
You can make a foreign key on a string column, referencing a string primary key of the lookup table. Then the values in the main table are restricted to the allowed values, but you don't need to do a join every time you want to display the human-readable value instead of an INT
primary key.
Lookup tables also allow you to add descriptive text to explain the usage of each value in the domain (I showed a description
column in the example lookup table above). You can't do that if you avoid using a lookup table, or if you use an ENUM
type.
Likewise I added an is_active
attribute in the lookup table, which allows you to "retire" certain values so you know they shouldn't be presented in the UI anymore, though you may need to keep them in the lookup so historical records in the main table can still reference the value. This is an example of associating extra attributes to values in the domain. Another thing you can't do unless you use a lookup table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With