When designing a schema for a DB (e.g. MySQL) the question arises whether or not to completely normalize the tables.
On one hand joins (and foreign key constraints, etc.) are very slow, and on the other hand you get redundant data and the potential for inconsistency.
Is "optimize last" the correct approach here? i.e. create a by-the-book normalized DB and then see what can be denormalized to achieve the optimal speed gain.
My fear, regarding this approach, is that I will settle on a DB design that might not be fast enough - but at that stage refactoring the schema (while supporting existing data) would be very painful. This is why I'm tempted to just temporarily forget everything I learned about "proper" RDBMS practices, and try the "flat table" approach for once.
Should the fact that this DB is going to be insert-heavy affect the decision?
Without normalization, a database can become slow, inconsistent, and unable to give you the answers you expect. We use the normalization process to design efficient, functional databases: each fact is stored where it logically and uniquely belongs.
Full normalization will generally not improve performance; in fact, it can often make it worse. But it will keep your data free of duplicates.
Here are some of the disadvantages of normalization: since data is not duplicated, table joins are required, which makes queries more complicated and reads slower; and because each query touches several tables, indexes cannot be used as effectively.
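To make the trade-off concrete, here is a minimal sketch of what a normalized design looks like in MySQL. The table and column names (orders, order_status, status_name, etc.) are illustrative only, not taken from the question:

```sql
-- Hypothetical normalized design: the status text lives only in a lookup table.
CREATE TABLE order_status (
    status_id   TINYINT UNSIGNED PRIMARY KEY,
    status_name VARCHAR(30) NOT NULL UNIQUE
);

CREATE TABLE orders (
    order_id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    customer_id INT UNSIGNED NOT NULL,
    status_id   TINYINT UNSIGNED NOT NULL,
    created_at  DATETIME NOT NULL,
    FOREIGN KEY (status_id) REFERENCES order_status (status_id)
) ENGINE=InnoDB;

-- Reading a human-readable status now requires a join...
SELECT o.order_id, o.created_at, s.status_name
FROM orders o
JOIN order_status s ON s.status_id = o.status_id;

-- ...whereas a "flat" table would store status_name in every row and skip
-- the join, at the cost of duplicating the text and risking inconsistency.
```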
A philosophical answer: Sub-optimal (relational) databases are rife with insert, update, and delete anomalies. These all lead to inconsistent data, resulting in poor data quality. If you can't trust the accuracy of your data, what good is it? Ask yourself this: Do you want the right answers slower or do you want the wrong answers faster?
As a practical matter: get it right before you get it fast. We humans are very bad at predicting where bottlenecks will occur. Make the database great, measure the performance over a decent period of time, then decide if you need to make it faster. Before you denormalize and sacrifice accuracy try other techniques: can you get a faster server, connection, db driver, etc? Might stored procedures speed things up? How are the indexes and their fill factors? If those and other performance and tuning techniques do not do the trick, only then consider denormalization. Then measure the performance to verify that you got the increase in speed that you "paid for". Make sure that you are performing optimization, not pessimization.
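As a rough illustration of "measure first", reusing the hypothetical orders schema above: asking the optimizer what it is doing and adding an index is often far cheaper than a schema change.

```sql
-- Ask the optimizer what it actually does before restructuring anything.
EXPLAIN
SELECT o.order_id, s.status_name
FROM orders o
JOIN order_status s ON s.status_id = o.status_id
WHERE o.customer_id = 42;

-- If the plan shows a full scan on orders, an index may be the real fix:
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- Refresh statistics and measure again before deciding to denormalize.
ANALYZE TABLE orders;
```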
[edit]
Q: So if I optimize last, can you recommend a reasonable way to migrate data after the schema is changed? If, for example, I decide to get rid of a lookup table - how can I migrate existing databases to this new design?
A: Sure.
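One possible migration, sticking with the illustrative orders / order_status tables from earlier (your column and constraint names will differ), is to copy the lookup value into the main table and then drop the lookup:

```sql
-- Add the denormalized column, backfill it from the lookup table,
-- then remove the foreign key, the old column, and the lookup table.
ALTER TABLE orders ADD COLUMN status_name VARCHAR(30);

UPDATE orders o
JOIN order_status s ON s.status_id = o.status_id
SET o.status_name = s.status_name;

ALTER TABLE orders
    DROP FOREIGN KEY orders_ibfk_1,   -- check SHOW CREATE TABLE for the real name
    DROP COLUMN status_id;

DROP TABLE order_status;
```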
BUT... consider a more robust approach:
Create some views on your fully normalized tables right now. Those views (virtual tables, "windows" on the data... ask me if you want to know more about this topic) would have the same defining query as the denormalized table you might eventually create. When you write your application or DB-layer logic, use the views (at least for read access; updatable views are... well, interesting). Then, if you denormalize later, create a new table populated from that same query, drop the view, and rename the new base table to whatever the view was called. Your application/DB-layer won't know the difference.
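A rough sketch of that view-based approach, again using the illustrative orders / order_status tables:

```sql
-- Today: expose the denormalized shape as a view over the normalized tables.
CREATE VIEW v_orders AS
SELECT o.order_id, o.customer_id, o.created_at, s.status_name
FROM orders o
JOIN order_status s ON s.status_id = o.status_id;

-- Later, only if measurement says you must denormalize:
CREATE TABLE orders_flat AS SELECT * FROM v_orders;
ALTER TABLE orders_flat ADD PRIMARY KEY (order_id);

DROP VIEW v_orders;
RENAME TABLE orders_flat TO v_orders;  -- callers keep querying v_orders unchanged
```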
There's actually more to this in practice, but this should get you started.