Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database design - relations vs properties

I have a problem with designing database (SQL/MySQL). Let's assume we have a user, user can have many friends and many posts and filled some data about himself.

It's quite obvious that for friends we need one pivot_table for n:n relation, for posts we need to create one extra table with user_id (1:n) relation.

So we need users, user_friends and posts tables. This is obvious. This is how relations should be handled.

But now let's assume we want for users to have the following data:

name - text
description - text
marital status - select only one from list
favourite colour - select only one from list
hobby - select up to 3 from list

For text fields (name, description) it's really obvious we simply create varchar/text columns in users table and that's it.

The general question is: how the other fields (chosen from lists) should be handled? Should I create relations for them or maybe should I create standard data columns with them?

In my opinion there is no point to create relation tables for that because using lists (select) we only limit user when he can in fact paste into database. In theory we could allow user to manually input as favourite colour his colour (for example red and if he types something wrong for example reds we would compare it will list of allowed colours). The same would be for gender - there is no point in my opinion to create extra table when we hold only woman and man and create relation for it.

First DB design:

I could for example create the following columns for properties:

marital_status - int
fav_colour - int
hobby_1 - int
hobby_2 - int
hobby_3 - int

And have one other table (or even plain array in PHP or other language) where I store that value 1 for fav_colour is for example red, value 2 for hobby is music and so on (it doesn't matter how I store those values here - I could also use enum type for that) .

For me benefits for such attitude is not creating many relations that are in fact rather properties and not relations (as I mentioned above), so less work + easier getting information about user - you don't need to use any joins what would be important if you have for user for example 20 or 100 such properties and I can search in user table very easy. Disadvantages are also quite obvious - data is not normalized, for any multi selection (as for example hobby) I need to create 3 columns and if in future I decide that user can select not 1 colour but 2 or 3, I would need to add 2 extra columns.

Alternative DB design:

I create extra tables: colours, hobbies, marital_statuses and I create 3 pivots tables: user_colours, user_hobbies, user_marital_statuses. Disadvantages: many joins. Advantages - if I created 3 extra pivot tables I I could easily allow user to select up to 10 colours and I don't need redesign database at all. But disadvantages also occur - difficult searching, a lot of work, many joins.

Detailed question

So to sum up - what solution would be better assuming:

  1. I would probably not change the maximum count of one properties (if I decided I allow maximum 3 hobbies, this won't probably ever change)
  2. Lists of choices for many fields would be relative short (for most of them less than 10)
  3. I need to search a lot in such database. Someone for example wants to search user that have fav_colour set to red and have hobby music.

If there are any other solutions or advantages/disadvantages you see I appreciate to share with me.

like image 868
Marcin Nabiałek Avatar asked Oct 15 '14 18:10

Marcin Nabiałek


People also ask

What are the 3 types of relationships in a database?

There are 3 different types of relations in the database: one-to-one. one-to-many, and. many-to-many.

What is relation database design?

Relational database design (RDD) models information and data into a set of tables with rows and columns. Each row of a relation/table represents a record, and each column represents an attribute of data. The Structured Query Language (SQL) is used to manipulate relational databases.

What is the difference between relation and relationship in database?

The main difference between relation and relationship in DBMS is that relation refers to a table in a relational model based database while relationship refers to how two tables are connected together in a relational model based database. A database is a collection of related data.


2 Answers

It sounds like you want to enforce some constraints on some of your users properties. For example, favorite colour must be one of red, green, blue, pink, orange, etc; marital status must be one of single, divorced, married.

You've described one way to do this: lookup tables. This is the best way if the possible values are dynamic and require ongoing maintenance, or if there are many possible values. From your description, that is not your situation. Your possible values will be quite static and short.

I recommend employing a sql CHECK constraint. With it, you can control a field's possible values. For example:

CREATE TABLE users
(
Name varchar(255) NOT NULL,
Description varchar(255),
Marital_Status varchar(10) NOT NULL,
Color varchar(10) NOT NULL,
CONSTRAINT chk_Color CHECK (Color in ('Red', 'Blue', 'Green', 'Orange')),
CONSTRAINT chk_Marriage CHECK (Marital_Status in ('Single', 'Married', 'Divorced'))
)

I haven't syntax checked this DDL statement, so it may contain punctuation errors. Also, the syntax may vary for your particular DBMS. I think this should work for MySQL.

like image 125
Eric Avatar answered Sep 30 '22 13:09

Eric


If users can change favorite colors/hobbies frequently, I would use lookup tables, in my example I'll refer to them as decode tables. All of the relationships between user/hobbies and user/colors will be found in that decode table.

Since you can only have 1 marital status, that is easy to handle it's a 1 to many relationship.

Create a table Marital_Status with 2 fields, Id (pk) and Status(varchar(n)) The decode table won't be required to lookup marital status.

Now I would recommend creating a table to hold colors and a table for hobbies. Same way we did marital status.

Hobbies

HobbyId, Hobby

Colors
ColorId, Color

Whenever you need to add/delete a new hobby/color do it in these decode tables.

It's up to you whether you want to use 1 decode table for each relationship or many ie. Hobby_Decode and Color_Decode etc.

I'll explain the scenario of using 1.

Create your decode table with the following fields...

Decode

Item_Type varchar(n) --We will push either Hobby or Color in this field

UserId int --self explanatory, holds the Id of the User to "lookup"

LookupId --will hold id's of either the Hobby or the Color

Let me create some sample data, and we will work off of that.

Hobbies table data

 | HobbyId | Hobby

      1      Studying 
      2      Doing Drugs
      3      Drinking     

Colors table data

 | ColorId | Color

     1        Red 
     2        Blue

While we're at it, Here's our user table.

Users

 | UserId | Name

      1     Marcin 
      2     CSharper

I like drinking, doing drugs and the color red, You're a nerd so you like to study and the color Blue. In our decode table we'll add the following entries to represent that.

Decode

 | Item_Type| UserId | LookUpId

    'Hobby'      2        2
    'Hobby'      2        3
    'Color'      2        1
    'Hobby'      1        1
    'Color'      1        2      

Looking at that decode table doesn't really tell us anything. Once we join our decode table to colors/hobbies it will be evident.

If you want to look up all of my hobbies and my favorite colors the query will look this

note: this is SQL Server syntax not mysql.

--Pull Hobbies
Select u.Name, dH.Item_Type as 'Favorite', h.Hobby as 'Item'
from User u
inner join decode dH on dH.UserId = u.UserId 
                     and dH.Item_Type = 'Hobby'
inner join Hobby h on h.HobbyId = dH.LookUpId
where u.UserId = 2 

--Union in Colors
Union

Select u.Name, dH.Item_Type as 'Favorite', h.Hobby 'Item'
from User u
inner join decode dC on dH.UserId = u.UserId 
                     and dH.Item_Type = 'Color'
inner join Color c on c.ColorId = dH.LookUpId
where u.UserId = 2 

Your output will look like

|    Name    |    Favorite   |     Item 

   CSharper         Hobby         Drinking
   CSharper         Hobby         Doing Drugs
   CSharper         Color         Red

If it is setup like this, than it is extremely easy to change/update peoples favorite hobbies and colors. The decode table will handle all of that. It just takes a simple entry or deletion of that table. And also this way, a User can have an infinite amount of favorite hobbies and colors since it's the decode table that drives that, and not the Users table definition.

Manipulating your sample query a little bit, if we want to find all Users who like the color blue and drinking the query would look like.

Select u.Name
from User u 
inner join decode d on d.UserId = u.UserId
inner join Hobby h on h.HobbyId = d.LookUpId and d.Item_Type = 'Hobby'
inner join Color c on C.ColorId = d.LookUpId and d.Item_Type = 'Color'
where h.Hobby = 'drinking' and c.Color = 'blue'

Doing joins like this is perfectly acceptable.

like image 32
CSharper Avatar answered Sep 30 '22 15:09

CSharper