Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database Normalization

I'm new to database design and I have been reading quite a bit about normalization. If I had three tables: Accommodation, Train Stations and Airports. Would I have address columns in each table or an address table that is referenced by the other tables? Is there such a thing as over-normalization?

Thanks

like image 823
showFocus Avatar asked Jul 19 '10 15:07

showFocus


People also ask

What is Normalisation in a database?

Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.

What are the four 4 types of database normalization?

First Normal Form (1 NF) Second Normal Form (2 NF) Third Normal Form (3 NF) Boyce Codd Normal Form or Fourth Normal Form ( BCNF or 4 NF)

What is normalization 1NF 2NF 3NF?

Following are the various types of Normal forms: A relation is in 1NF if it contains an atomic value. 2NF. A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functional dependent on the primary key. 3NF. A relation will be in 3NF if it is in 2NF and no transition dependency exists.

What is database normalization in SQL?

Normalization is the process to eliminate data redundancy and enhance data integrity in the table. Normalization also helps to organize the data in the database. It is a multi-step process that sets the data into tabular form and removes the duplicated data from the relational tables.


1 Answers

Database Normalization is all about constructing relations (tables) that maintain certain functional dependencies among the facts (columns) within the relation (table) and among the various relations (tables) making up the schema (database). Bit of a mouth-full, but that is what it is all about.

A Simple Guide to Five Normal Forms in Relational Database Theory is the classic reference for normal forms. This paper defines in simple terms what the essence of each normal form is and its significance with respect to database table design. This is a very good "touch-stone" reference.

To answer your specific question properly requires additional information. Some critical questions you have to ask are:

  • Is an Address a simple fact (e.g. blob of text) or a composite fact (e.g. composed of multiple attributes: Address line, City Name, Postal Code etc.)
  • What are the other "facts" relating to "Accommodation", "Airport" and "Train Station"?
  • What sets of "facts" uniquely and minimally identify an "Airport", an "Accommodation" and a "Train Station" (these facts are typically called a key or candidate key)?
  • What functional dependencies exist among Address facts and the facts composing each relations key?

All this to say, the answer to your question is not as straight forward as one might hope for!

Is there such a thing as "over normalization"? Maybe. This depends on whether the functional dependencies you have identified and used to build your tables are of significance to your application domain.

For example, suppose it was determined that an address was composed of multiple attributes; one of which is postal code. Technically a postal code is a composite item too (at least Canadian Postal Codes are). Further normalizing your database to recognize these facts would probably be an over-normalization. This is because the components of a postal code are irrelevant to your application and therefore factoring them into the database design would be an over-normalization.

like image 144
NealB Avatar answered Sep 19 '22 17:09

NealB