Am I Properly Normalizing this Data

Tags: database, database-normalization

I am completing normalization exercises from the web to test my abilities to normalize data. This particular problem was found at: https://cs.senecac.on.ca/~dbs201/pages/Normalization_Practice.htm (Exercise 1)

The table this problem is based on is as follows:

[image: Table, https://i.stack.imgur.com/P7vCu.png]

The unnormalized table that can be created from this table is:

[image: https://i.stack.imgur.com/IRNuC.png]

To comply with First Normal Form (1NF), I have to get rid of the repeating fields in the table by moving visitdate, procedure_no, and procedure_name to their own respective tables:

[image: https://i.stack.imgur.com/bCk4g.png]

This also complies with 2NF and 3NF, which makes me question whether I have performed the normalization correctly. Please provide feedback if I did not properly move from UNF to 1NF.

asked May 24 '18 15:05

Zampanò

2 Answers

In a first step you could create the following tables (assuming pet_id is unique in the table):

Pets:   pet_id, pet_name, pet_type, pet_age, owner
Visits: pet_id, visit_date, procedure

Going further you could split procedure since the description is repeating:

Pets:       pet_id, pet_name, pet_type, pet_age, owner
Visits:     pet_id, visit_date, procedure_id
Procedures: procedure_id, description
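The three-table layout above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the listing; the column types, the helper names `build_db` and `visit_history`, and every sample row are assumptions made up for illustration and are not the exercise's data.

```python
import sqlite3

# Names follow the listing above; types and sample rows are assumptions.
# Note: SQLite only enforces REFERENCES when PRAGMA foreign_keys is ON;
# here the clauses mainly document the intended relationships.
SCHEMA = """
CREATE TABLE Pets (
    pet_id   INTEGER PRIMARY KEY,
    pet_name TEXT NOT NULL,
    pet_type TEXT NOT NULL,
    pet_age  INTEGER,
    owner    TEXT
);
CREATE TABLE Procedures (
    procedure_id INTEGER PRIMARY KEY,
    description  TEXT NOT NULL
);
CREATE TABLE Visits (
    pet_id       INTEGER NOT NULL REFERENCES Pets(pet_id),
    visit_date   TEXT    NOT NULL,
    procedure_id INTEGER NOT NULL REFERENCES Procedures(procedure_id)
);
"""


def build_db():
    """Create an in-memory database and load a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Pets VALUES (1, 'Rex', 'dog', 5, 'Owner A')")
    conn.executemany(
        "INSERT INTO Procedures VALUES (?, ?)",
        [(10, "vaccination"), (20, "check-up")],
    )
    conn.executemany(
        "INSERT INTO Visits VALUES (?, ?, ?)",
        [(1, "2018-01-10", 10), (1, "2018-03-02", 20), (1, "2018-06-15", 10)],
    )
    return conn


def visit_history(conn, pet_id):
    """Return (visit_date, description) pairs for one pet, oldest first."""
    return conn.execute(
        """
        SELECT v.visit_date, p.description
        FROM Visits v
        JOIN Procedures p ON p.procedure_id = v.procedure_id
        WHERE v.pet_id = ?
        ORDER BY v.visit_date
        """,
        (pet_id,),
    ).fetchall()


conn = build_db()
history = visit_history(conn, 1)
```

The description "vaccination" is stored once in Procedures even though the pet received it twice; the join puts it back next to each visit.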

Although there can be multiple procedures on the same visit_date for the same pet_id, I see no reason to split those further: a date could (in theory) be stored in 2 bytes, and splitting that data would create more overhead (plus an extra index).
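One way to keep several procedures per pet per day in a single Visits table is a composite key over all three columns. The answer does not name a key, so the `PRIMARY KEY` below is an assumption, as are the helper `try_insert` and the sample rows; this is only a sketch with sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Visits (
        pet_id       INTEGER NOT NULL,
        visit_date   TEXT    NOT NULL,
        procedure_id INTEGER NOT NULL,
        -- Assumed key: a pet can have many procedures on one date,
        -- but the same procedure is recorded at most once per date.
        PRIMARY KEY (pet_id, visit_date, procedure_id)
    )
    """
)

# Hypothetical rows: two different procedures on the same day for pet 1.
conn.execute("INSERT INTO Visits VALUES (1, '2018-01-10', 10)")
conn.execute("INSERT INTO Visits VALUES (1, '2018-01-10', 20)")


def try_insert(row):
    """Return True if the row was stored, False if the key rejected it."""
    try:
        conn.execute("INSERT INTO Visits VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# An exact repeat of an existing row violates the composite key.
duplicate_accepted = try_insert((1, "2018-01-10", 10))

same_day_count = conn.execute(
    "SELECT COUNT(*) FROM Visits WHERE pet_id = 1 AND visit_date = '2018-01-10'"
).fetchone()[0]
```

Both same-day procedures are kept, while the exact repeat is rejected, so no separate table of visit dates is needed for this to work.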

You would also want to change pet_age to pet_birth_date since the age changes over time.
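With a stored birth date, the age can be derived whenever it is needed. A minimal sketch, where the function name `age_in_years` and the example dates are hypothetical:

```python
from datetime import date


def age_in_years(birth_date: date, today: date) -> int:
    """Return the number of whole years between birth_date and today."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Hypothetical pet born on 2010-06-15, evaluated on two different days.
age_before_birthday = age_in_years(date(2010, 6, 15), date(2018, 5, 24))
age_on_birthday = age_in_years(date(2010, 6, 15), date(2018, 6, 15))
```

The stored value never goes stale; only the `today` argument changes.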

Since this is the first exercise in your list, the above will probably be more than enough.

Going even further:

An owner can have multiple pets, so another table could be created:

Pet_owners: owner_id, owner_name

and then only use owner_id in the Pets table. In a real system there would be customer_id, name, address, phone, email, etc., so that data should always be in a separate table.
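The owner split can be sketched with sqlite3 as follows. `Pet_owners`, `owner_id`, and `owner_name` come from the listing; the trimmed-down Pets table, the column types, and all sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Pet_owners (
        owner_id   INTEGER PRIMARY KEY,
        owner_name TEXT NOT NULL
    );
    CREATE TABLE Pets (
        pet_id   INTEGER PRIMARY KEY,
        pet_name TEXT NOT NULL,
        owner_id INTEGER NOT NULL REFERENCES Pet_owners(owner_id)
    );
    """
)

# Hypothetical data: one owner with two pets.
conn.execute("INSERT INTO Pet_owners VALUES (1, 'Owner A')")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?, ?)",
    [(1, "Rex", 1), (2, "Tom", 1)],
)

# Correcting the owner's name touches a single row, however many pets they have.
cursor = conn.execute(
    "UPDATE Pet_owners SET owner_name = 'Owner A (renamed)' WHERE owner_id = 1"
)
rows_updated = cursor.rowcount

pets_with_owner = conn.execute(
    """
    SELECT p.pet_name, o.owner_name
    FROM Pets p
    JOIN Pet_owners o ON o.owner_id = p.owner_id
    ORDER BY p.pet_id
    """
).fetchall()

# A pet pointing at a non-existent owner is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Pets VALUES (3, 'Stray', 99)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False
```

Both pets pick up the corrected name through the join, which is the practical benefit of keeping owner details in one place.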

You could even do the same for pet_type and store the id in 1 or 2 bytes, but whether that is worthwhile depends on the kind of queries you want to run against the data later.

answered Oct 02 '22 20:10

Danny_ds

The question is poorly presented. Look at the last two columns. The askers do not mean that each column's values are sets on their own. They mean that the pair of values on the same line makes one element of a set. They should have had one column whose values were triplets: date, number & name. That's what they did when they used just one column (the last one) for number & name. Notice that their solution in the pdf linked to by the page you link to has a table that has all three of date, number & name.

But how are you supposed to know that the values should be paired? After all if the date column gave the set of a pet's visit dates & the procedure column gave the set of procedure number & names a pet ever had then we wouldn't be supposed to take a pair of values on the same line as an element of a set. Unfortunately you are just supposed to magically guess correctly. (A hint is that the number of dates & number-name pairs for a pet are always the same.)

The above took the blank areas in the illustration to be there to make room for the vertical display of set-valued attributes; the portrayed table has 4 rows. But maybe they are there because you are supposed to get a relation from this illustration by interpreting a blank subrow as representing the most recent non-blank subrow. Then the table wouldn't have any set-valued columns; the portrayed table has 9 rows. It happens that this interpretation disagrees with the linked answer's UNF & 1NF sections.
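The two readings can be made concrete with a small Python sketch. The printed lines below are invented for illustration (two pets, five lines) and are not the exercise's data; `None` stands for a blank cell, and the function names are hypothetical.

```python
# Each tuple is one printed line: (pet_id, visit_date, procedure).
# None marks a blank cell. All values are made up for illustration.
printed_lines = [
    (1, "2018-01-10", "01 - vaccination"),
    (None, "2018-03-02", "10 - check-up"),
    (None, "2018-06-15", "01 - vaccination"),
    (2, "2018-02-01", "05 - dental"),
    (None, "2018-02-20", "10 - check-up"),
]


def as_set_valued(lines):
    """Reading 1: one row per pet, holding a set of (date, procedure) pairs."""
    rows = {}
    current = None
    for pet_id, visit_date, procedure in lines:
        if pet_id is not None:
            current = pet_id
        rows.setdefault(current, set()).add((visit_date, procedure))
    return rows


def as_filled_down(lines):
    """Reading 2: a blank repeats the most recent non-blank value above it."""
    rows = []
    current = None
    for pet_id, visit_date, procedure in lines:
        if pet_id is not None:
            current = pet_id
        rows.append((current, visit_date, procedure))
    return rows


set_valued = as_set_valued(printed_lines)
filled_down = as_filled_down(printed_lines)
```

Under the first reading this sample has 2 rows with a set-valued attribute; under the second it has 5 ordinary rows. The same printed illustration yields different tables, which is why the reading convention needs to be stated.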

If they weren't going to explain the table & were just relying on your guesses, it would have been clearer if they had put a visit's procedure date, number & name under one column, just as they put a procedure number & name in one column. But really, they should always tell you how to read the illustration. And really, you should always ask how to read an illustration. If you have any interpretation conventions from a related course/textbook then you should have put them in your question for us to know.

Unfortunately "UNF" tables are almost always similarly poorly given without any description about how they are to be interpreted. Also "1NF" has no standard meaning & there is no standard notion of "normalizing to 1NF".

answered Oct 02 '22 19:10

philipxy
