Here are a couple of (incomplete) database tables that store information about the rooms of a hotel. They store the same information, but their designs differ:
Store floor information in a separate column:
| id | floor
|----|-------
| 1 | 1
| 2 | 1
| 3 | 2
| 4 | 2
Store floor information in IDs:
| id
|-----
| 101
| 102
| 201
| 202
Is it always a terrible idea to store semantic data in IDs the way table 2 does, or are there cases where more expressive IDs are valuable enough to justify it?
If you want to use a natural key, then use a natural key. Don't name a natural key `id`.
If you use a synthetic key, treat it as an arbitrary value that must be unique, but has no other meaning.
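As a rough illustration of the two options, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`room_natural`, `room_synthetic`, `room_number`) are made up for this example and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Natural key: the column is named for what it is (a room number), not "id".
conn.execute("CREATE TABLE room_natural (room_number INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO room_natural VALUES (?)", [(101,), (102,), (201,), (202,)]
)

# Synthetic key: id is an arbitrary unique value; the floor has its own column.
conn.execute(
    "CREATE TABLE room_synthetic (id INTEGER PRIMARY KEY, floor INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO room_synthetic (floor) VALUES (?)", [(1,), (1,), (2,), (2,)]
)

natural_keys = [
    row[0]
    for row in conn.execute(
        "SELECT room_number FROM room_natural ORDER BY room_number"
    )
]
synthetic_rows = conn.execute(
    "SELECT id, floor FROM room_synthetic ORDER BY id"
).fetchall()
```

In the second table nothing ever derives the floor from `id`; the database generates the value and the code only relies on it being unique.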
This isn't really about the semantics of the data; it's about atomicity and whether you would be violating first normal form (1NF). The question you should ask yourself is:
Should the room number be treated as an atomic piece of data from the data management perspective?
In other words, will you always read the room number from (and write it to) the database as a whole, regardless of whether the client code takes it apart?
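To make the atomicity question concrete, here is a small sketch with Python's `sqlite3`. It assumes a hypothetical encoding `room_number = floor * 100 + position`, and the names `room_atomic`, `room_split`, and `number_on_floor` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Atomic design: the room number is stored and read as one value.
conn.execute("CREATE TABLE room_atomic (room_number INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO room_atomic VALUES (?)", [(101,), (102,), (201,), (202,)]
)

# Split design: floor and position are separate attributes, keyed together.
conn.execute(
    """
    CREATE TABLE room_split (
        floor INTEGER NOT NULL,
        number_on_floor INTEGER NOT NULL,
        PRIMARY KEY (floor, number_on_floor)
    )
    """
)
conn.executemany(
    "INSERT INTO room_split VALUES (?, ?)", [(1, 1), (1, 2), (2, 1), (2, 2)]
)

# Asking the atomic design about floors forces the query to take the value
# apart (integer division under the assumed floor * 100 + position encoding).
floor2_atomic = [
    row[0]
    for row in conn.execute(
        "SELECT room_number FROM room_atomic "
        "WHERE room_number / 100 = 2 ORDER BY room_number"
    )
]

# The split design answers the same question with a plain column filter.
floor2_split = conn.execute(
    "SELECT floor, number_on_floor FROM room_split "
    "WHERE floor = 2 ORDER BY number_on_floor"
).fetchall()
```

If queries like the first one show up in your data management code, the room number is probably not atomic for your purposes and the split design is the safer fit.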
NOTE: I don't know if it's intentional or not, but your scenario (1) doesn't contain enough data to reconstruct the room number, so it models a different domain than scenario (2), which does.
BTW, storing semantic data in a key is not at all a bad practice in and of itself. If some attribute or combination of attributes has to be unique, then you must create a key on it, whether it has intrinsic meaning or not. You can't replace that key with a "surrogate" key; you can only add the surrogate alongside it (which has its pros and cons, as you can imagine).
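The point about adding a surrogate next to the natural key, instead of replacing it, can be sketched as follows. This again uses Python's `sqlite3`; the `room` table layout is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE room (
        id INTEGER PRIMARY KEY,              -- surrogate key, no meaning
        room_number INTEGER NOT NULL UNIQUE  -- natural key, still enforced
    )
    """
)
conn.execute("INSERT INTO room (room_number) VALUES (101)")

# A second row with the same room number would get a fresh surrogate id,
# so only the UNIQUE constraint on room_number can reject it.
try:
    conn.execute("INSERT INTO room (room_number) VALUES (101)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

room_count = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
```

Without the `UNIQUE` constraint, the duplicate insert would succeed and the table would hold two rows for room 101 that differ only in their surrogate `id`.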