According to Five Simple Database Design Errors You Should Avoid by Anith Sen, using a common-lookup table to store the possible statuses for an entity is a common mistake.
Edit + Answer: The figures in Anith's article aren't well labelled - I thought both Figure 1 and Figure 2 are examples of bad design, whereas Figure 2 is good design. Phew, got worried there for a moment.
I'll keep my question below for reference.
"You lose the means to ensure accurate data; constraints. By combining different entities into a single table, you have no declarative means to restrain values of a certain category."
How is constraining values losing accuracy?"You are forced to represent every data type as a string with this type of generic lookup table."
If I want to represent another data type, I can add a column for it to my lookup table."You commit yourself to rigidity and subsequent complexity."
How?Fourthly and finally, you are faced with the physical implementation issues.
I don't see why.
I disagree with most of the reasons given and would like some objective critique on my incorrect? logic.
Citing the example of jobs at a repair service with many possible statuses that generally have a natural flow, let's take a JobStatus
table:
Arguably, some of these statuses can be normalised to tables like Couriered Items
, Completed Jobs
and Quotes
(with Pending/Accepted/Rejected statuses), but that feels like unnecessary schema complication.
Another common example would be an OrderStatus
table to restrict the status of an order:
The status titles and descriptions are in one place for editing and are easy to scaffold as a drop-down with a foreign key for dynamic data applications. This has worked well for me in the past. If the business rules dictate the creation of a new order status, I can just add it to OrderStatus
table, without rebuilding my code.
Why is this a bad practice?
Edits: I added Anith's reason to my question and tried to remain objective.
--
Lookup tables provide a way to add extra information in an event. A lookup table consists of a list of keys and values. You define a lookup table using the table function, and access the table using the lookup function. You can create a lookup table in the rules file or in a separate file.
Answer and Explanation: Storing all data in one single table will be confusing, may have security issues and there will be duplication in recording.
Lookup tables provide the essential function of helping you maintain data integrity in your database environment. For example, if you have users entering their gender into a data item, the table that contains the Gender item can reference a lookup table to verify that only the value M or F is used.
A lookup associates values used by one application for a specific field to the values used by other applications for the same field. This provides the capability to map values across vocabularies or systems. For example, you can map country codes, city codes, currency codes, and so on.
What Anith Sen is advising against is having a single lookup table for all lookup codes. That is the significance of the category
column in his example. Having a separate table for each category is definitely the way to go.
This is because:
In your examples JobStatus and OrderStatus are separate categories. applicable to separate entities. That is why they need different look-up tables. There is not even a problem with sharing the same code table across several different data tables. Where it becomes problematic is when we have separate data tables (entities) for which some statuses are not appropriate: that is the time to split the codes out into separate look-up tables.
edit
I see you have edited your post to cite all of Anith's points. I think the most important point one is the first one, regarding constraints. If you want to restrict the ORDERS.STATUS column to have values from the OrderStatus category then you have to have a separate table to enforce a foreign key. Your alternatives are:
All of those options suck, from the database perspective.
You already have a correct answer, so this remark is extra.
The big problem with OTLT (One true lookup table) is that you end up placing values from different domains in the same column, and then using a separate column to disambiguate.
In your examples, you have used numbers alongside each status description. If those numbers are numeric codes, as I think they must be, then you do not want the value 4, meaning "waiting for client confirmation" in the same column as the value 4, meaning "Cancelled". If you do this, then you can't use this column as the PK for your one true lookup table.
If you give your one true lookup table another column, call it "CodeType" and use "CodeType" and "Code" as a compound PK, you have introduced more complexity than you introduce by having a separate lookup table for each code type.
The short answer is this: don't put values from different domains in the same column. It always causes more trouble than it saves.
Incidentally, it's possible to create a view that combines all the separate lookup tables into what appears to be a single giant lookup table. This can be useful in certain very unusual situations.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With