
Does normalization really hurt performance in high traffic sites?

I am designing a database and I would like to normalize it. One query will join about 30-40 tables. Will this hurt the website's performance if it ever becomes extremely popular? This will be the main query, and it will be called about 50% of the time. The other queries will each join about two tables.

I can choose now whether or not to normalize, but if normalization becomes a problem in the future I may have to rewrite 40% of the software, which could take me a long time. Does normalization really hurt in this case? Should I denormalize now while I have the time?

Luke101 asked Apr 24 '10 00:04



People also ask

Does normalization affect performance?

We use normalization to reduce the chances of anomalies that may arise when data is inserted, deleted, or updated. Normalization doesn't necessarily increase performance.

What is the main disadvantage of normalization?

Normalization has a few drawbacks. Because the data is spread across more tables, queries need more joins, which makes them longer to write and slower to run. The database also becomes harder to implement.

Why over normalization can cause performance issues?

Normalization means minimizing redundancy in stored data. Instead of duplicating data, you set up relationships (often with foreign key constraints) between multiple tables. However, while normalization might lead to a smaller amount of stored data, it can create performance problems because many queries end up joining multiple tables.


2 Answers

I quote: "normalize for correctness, denormalize for speed - and only when necessary"

I refer you to: In terms of databases, is "Normalize for correctness, denormalize for performance" a right mantra?

HTH.

Sunny answered Sep 19 '22 00:09



When performance is a concern, there are usually better alternatives than denormalization:

  • Creating appropriate indexes and statistics on the involved tables
  • Caching
  • Materialized views (Indexed views in MS SQL Server)
  • Keeping a denormalized copy of your tables, used exclusively by the queries that need it, in addition to the normalized tables that are used in most cases (this requires writing synchronization code, which could run either as a trigger or as a scheduled job, depending on how current the data needs to be)
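
The last option (plus the index from the first bullet) could look roughly like the sketch below. It uses Python's built-in `sqlite3` module only so that it runs anywhere; the `customers`, `orders`, and `order_summary` tables and the trigger names are hypothetical, and a real 30-40 table schema would need triggers (or a scheduled job) covering every source table and every kind of change, including deletes.

```python
import sqlite3

# In-memory database; all table and trigger names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized source tables: the system of record.
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );

    -- Index on the join column, so the normalized join stays cheap.
    CREATE INDEX idx_orders_customer ON orders(customer_id);

    -- Denormalized copy, read only by the hot query.
    CREATE TABLE order_summary (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        total         REAL NOT NULL
    );

    -- Synchronization via triggers: copy each new order into the summary.
    CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
    BEGIN
        INSERT INTO order_summary (order_id, customer_name, total)
        SELECT NEW.id, c.name, NEW.total
        FROM customers c
        WHERE c.id = NEW.customer_id;
    END;

    -- Propagate customer renames to the duplicated column.
    CREATE TRIGGER customers_after_rename AFTER UPDATE OF name ON customers
    BEGIN
        UPDATE order_summary
        SET customer_name = NEW.name
        WHERE order_id IN (SELECT id FROM orders WHERE customer_id = NEW.id);
    END;
""")

# Writes go to the normalized tables only.
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id, total) VALUES (10, 1, 99.5)")
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")

# The normalized join and the denormalized copy should agree.
joined = conn.execute(
    "SELECT o.id, c.name, o.total "
    "FROM orders o JOIN customers c ON c.id = o.customer_id "
    "ORDER BY o.id"
).fetchall()
flat = conn.execute(
    "SELECT order_id, customer_name, total FROM order_summary ORDER BY order_id"
).fetchall()
```

With this arrangement the application keeps writing to the normalized tables, and only the hot query reads from `order_summary`, so dropping the copy later does not require rewriting the rest of the software.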
ckarras answered Sep 22 '22 00:09
