In my database, many tables have the 'State' field, representing the state that, that particular entity falls in. I have been told that we should use Lookup tables for this kind of thing, but I am unsure of the exact mechanism. Can someone clarify these points?
How is the integrity maintained? (i.e. how do I make sure that only the values from the state table go into the other tables?)
Does the state name go into the other tables, or does the state id from the state table go into the other tables?
1 - Integrity is maintained using what is called a FOREIGN KEY constraint. A reasonable scenario might have you do these two tables:
Table Name: STATE_CODE
ID DESCRIPTION
=================
1 Alabama
2 Arkansas
...
50 Wyoming
Table Name: CUSTOMER
=====================
CUST_ID CUST_NAME CUST_STATE
100 AAA Company 1 --they are in Alabama!
200 ZZZ Company 50 --they are in Wyoming!
This answers your question #2: The state codes, not the full names, go in the CUSTOMER table in this example.
A typical script to impose this kind of structure on an existing layout would be like this:
--first, create the lookup table
CREATE TABLE STATE_CODE(
ID INTEGER NOT NULL
,DESCRIPTION VARCHAR(100) NOT NULL
,PRIMARY KEY(ID)
);
--now add a reference to the lookup table inside your existing table
--the REFERENCES part will **force** entries
--to have a matching entry in STATE_CODE
ALTER TABLE CUSTOMER ADD STATE_CODE_ID REFERENCES STATE_CODE(ID);
And this answers your question #1: That "REFERENCES" command will create a Foreign Key constraint that will force all entries in CUSTOMER.STATE_CODE to have a corresponding entry in the STATE_CODE table. After setting this up, if someone were to try this:
INSERT INTO CUSTOMER(CUST_ID,CUST_NAME,CUST_STATE)
VALUES(9000,'Martians',74837483748);
Then they would get an error message, and that faulty data would never get entered (unless, of course, you really did have a state with a code of 74837483748).
Answers:
Integrity is maintained by foreign key constraints.
A foreign key constraint ensures that the only values the child table will allow in the specified column, come from the parent table's designated column.
For sake of join/various database operations, the smallest data type possible is recommended because the performance will be better.
For example, INT takes 4 Bytes while VARCHAR2(4+) takes more than that. From a performance perspective, it will be faster if you use INT than VARCHAR2(4+). But you do want two columns - one to serve as primary key while the other is the human readable description. This approach allows you to change the description without impacting existing records.
This leads to a discussion about artificial/surrogate and natural keys, for what is best to use as a primary key (and ultimately a foreign key).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With