What is normalization in MySQL and in which case and how we need to use it?
What Is Normalization in SQL? 1st Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF)
Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.
Database normalization is the process of organizing data into tables in such a way that the results of using the database are always unambiguous and as intended. Such normalization is intrinsic to relational database theory.
In brief, normalization is a way of organizing the data in the database. Normalization entails organizing the columns and tables of a database to ensure that their dependencies are properly enforced by database integrity constraints. It usually divides a large table into smaller ones, so it is more efficient.
I try to attempt to explain normalization in layman terms here. First off, it is something that applies to relational database (Oracle, Access, MySQL) so it is not only for MySQL.
Normalisation is about making sure each table has the only minimal fields and to get rid of dependencies. Imagine you have an employee record, and each employee belongs to a department. If you store the department as a field along with the other data of the employee, you have a problem - what happens if a department is removed? You have to update all the department fields, and there's opportunity for error. And what if some employees does not have a department (newly assigned, perhaps?). Now there will be null values.
So the normalisation, in brief, is to avoid having fields that would be null, and making sure that the all the fields in the table only belong to one domain of data being described. For example, in the employee table, the fields could be id, name, social security number, but those three fields have nothing to do with the department. Only employee id describes which department the employee belongs to. So this implies that which department an employee is in should be in another table.
Here's a simple normalization process.
EMPLOYEE ( < employee_id >, name, social_security, department_name)
This is not normalized, as explained. A normalized form could look like
EMPLOYEE ( < employee_id >, name, social_security)
Here, the Employee table is only responsible for one set of data. So where do we store which department the employee belongs to? In another table
EMPLOYEE_DEPARTMENT ( < employee_id >, department_name )
This is not optimal. What if the department name changes? (it happens in the US government all the time). Hence it is better to do this
EMPLOYEE_DEPARTMENT ( < employee_id >, department_id ) DEPARTMENT ( < department_id >, department_name )
There are first normal form, second normal form and third normal form. But unless you are studying a DB course, I usually just go for the most normalized form I could understand.
Hope this helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With