Is it better to store redundant information or join tables when necessary in MySQL?

I have an online shop where users can run their own small shops with their own products. Each of these products can have questions associated with it, and the owner of the shop can answer those questions. This information is stored in three tables: a "Questions"(QuestionID,ProductID,...) table, a "Products"(ProductID,ShopID,...) table and a "Shop"(ShopID,OwnerID,...) table.

Is it better to add a ShopID column to the "Questions" table (so that a shop owner can view all of the questions for their shop) or to join those three tables to get the questions matching a certain shop?

Omar Kohl asked Jul 13 '10 12:07




2 Answers

It is almost always better to join and avoid redundant information. You should only denormalize when you must do so in order to meet a performance goal - and you can't know if you need to do this until you try with normalized tables first.

Note that denormalization helps read performance at the expense of slower writes. It also makes it easier for a coding mistake to leave data out of sync: since you're storing the same thing in more than one place, you now have to be sure to update every copy.
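To make the out-of-sync risk concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, a hypothetical redundant ShopID column on Questions, and made-up sample rows. A product is moved to another shop by updating Products only, and the copy in Questions is left stale.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (
    ProductID INTEGER PRIMARY KEY,
    ShopID    INTEGER NOT NULL
);
-- Denormalized variant: Questions carries its own copy of ShopID.
CREATE TABLE Questions (
    QuestionID INTEGER PRIMARY KEY,
    ProductID  INTEGER NOT NULL,
    ShopID     INTEGER NOT NULL
);
INSERT INTO Products  VALUES (10, 1);
INSERT INTO Questions VALUES (1, 10, 1);
""")

# The coding mistake: the product moves to shop 2, but only Products is updated.
conn.execute("UPDATE Products SET ShopID = 2 WHERE ProductID = 10")

# Find questions whose redundant ShopID disagrees with the product's ShopID.
stale = conn.execute("""
    SELECT q.QuestionID, q.ShopID, p.ShopID
    FROM Questions AS q
    JOIN Products  AS p ON p.ProductID = q.ProductID
    WHERE q.ShopID <> p.ShopID
""").fetchall()

print(stale)
```

With the normalized schema there is no second copy to forget, so this class of bug cannot occur.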

Donnie answered Nov 12 '22 19:11



Generally it is better to avoid redundant information. Given appropriate indexes, this should be quite a cheap join, and I wouldn't denormalise in that manner unless I saw in the query plans that the JOIN was causing problems (perhaps because of the number of records in the tables).
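As a rough illustration of the normalized approach, the sketch below builds the three tables from the question, indexes the joining columns, and fetches all questions for one owner. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the Body column, the index names, the sample rows and the questions_for_owner helper are all assumptions made for the example. In MySQL you would inspect the plan with EXPLAIN rather than SQLite's EXPLAIN QUERY PLAN.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shop (
    ShopID  INTEGER PRIMARY KEY,
    OwnerID INTEGER NOT NULL
);
CREATE TABLE Products (
    ProductID INTEGER PRIMARY KEY,
    ShopID    INTEGER NOT NULL REFERENCES Shop(ShopID)
);
CREATE TABLE Questions (
    QuestionID INTEGER PRIMARY KEY,
    ProductID  INTEGER NOT NULL REFERENCES Products(ProductID),
    Body       TEXT
);
-- Indexes on the columns used to walk from owner to shop to product to question.
CREATE INDEX idx_shop_owner        ON Shop(OwnerID);
CREATE INDEX idx_products_shop     ON Products(ShopID);
CREATE INDEX idx_questions_product ON Questions(ProductID);

INSERT INTO Shop      VALUES (1, 100), (2, 200);
INSERT INTO Products  VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO Questions VALUES
    (1, 10, 'Is it waterproof?'),
    (2, 11, 'What sizes?'),
    (3, 20, 'Do you ship abroad?');
""")

QUERY = """
    SELECT q.QuestionID, q.Body
    FROM Shop      AS s
    JOIN Products  AS p ON p.ShopID    = s.ShopID
    JOIN Questions AS q ON q.ProductID = p.ProductID
    WHERE s.OwnerID = ?
    ORDER BY q.QuestionID
"""

def questions_for_owner(conn, owner_id):
    """Return (QuestionID, Body) rows for every product in the owner's shops."""
    return conn.execute(QUERY, (owner_id,)).fetchall()

owner_questions = questions_for_owner(conn, 100)
print(owner_questions)

# Look at the plan before deciding the join is a problem.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (100,)).fetchall()
for row in plan:
    print(row)
```

If the ShopID is already known, the Shop table can be left out and the query only needs to join Questions to Products.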

You would also need to consider the ratio of reads to writes. Denormalisation will help the reads but add overhead to writes.

Martin Smith answered Nov 12 '22 19:11
