
SQL Table linking... is it better to have a linking table, or a delimited column?

My database has two tables, one contains a list of users, the other a list of roles. Each user will belong to one or more roles, and of course each role will have multiple users in it.

I've come across two ways to link the information. The first is to add a third table which contains the IDs from both tables. A simple join will then return all the users that belong to a role, or all the roles to which a user belongs. However, as the database grows, the datasets returned by these simple queries will grow exponentially.

The second method is to add a column to the users table in which a delimited list of roles is stored. This will eliminate the need for the third linking table, which may have a positive effect on database growth. The downside is that SQL does not have the ability to use delimited lists. The only way I've found to process that information is to use a temporary table and a custom function.

In viewing my execution plans, the "table scan" event is the one that takes the most resources. It makes sense that eliminating a table from the equation would speed things up. The function takes up less than 1% of the resources.

These tests were done on a database with fewer than 20 records. As the size of the database grows, the table scans will take longer, so perhaps limiting them is the best choice.

If using the delimited list is a good way to go, why is nobody doing it?

Please tell me which is your preferred method (even if it's different from my two) and why.

Thank you.

RichieACC asked Jan 21 '10 15:01




1 Answer

If you have a delimited list, finding users with a given role is going to become very expensive: effectively, you need to do a FULL scan of that table, and look at all the values for that column in every row, trying to see if it contains a given role.
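To make that cost concrete, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) purely so it can be run anywhere; the question does not say which engine is in use. The `Roles` column, the sample names, and the role numbers are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical delimited-column design: role ids stored as "1,12".
conn.execute("CREATE TABLE User (UserId INTEGER PRIMARY KEY, Name TEXT, Roles TEXT)")
conn.executemany(
    "INSERT INTO User (UserId, Name, Roles) VALUES (?, ?, ?)",
    [(1, "alice", "1,12"), (2, "bob", "2"), (3, "carol", "12,21")],
)

# Naive substring match: looking for role 2 also matches "12" and "21".
naive = conn.execute(
    "SELECT Name FROM User WHERE Roles LIKE '%2%' ORDER BY UserId"
).fetchall()

# Wrapping both sides in delimiters fixes the false matches, but the
# expression still has to be evaluated against every row in the table.
exact = conn.execute(
    "SELECT Name FROM User WHERE ',' || Roles || ',' LIKE '%,2,%' ORDER BY UserId"
).fetchall()

# Ask SQLite how it would run the "exact" query; the last column of each
# plan row is a human-readable description of the step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT Name FROM User WHERE ',' || Roles || ',' LIKE '%,2,%'"
).fetchall()
plan_text = " | ".join(row[-1] for row in plan)

print(naive, exact, plan_text)
```

In this sketch the plan reports a scan of `User`, and an ordinary index on `Roles` would not change that, because the match can start anywhere inside the string.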

A separate table (normalized, many to many relation) is the way to go, and with proper indexing you will not have full scans happening.

eg:

User:  UserId, Name, ....
Role:  RoleId, Name, ....
UserRole:  UserRoleId, UserId, RoleId

(UserRoleId is optional; you could alternatively make the PK the combination UserId+RoleId. I won't get into the surrogate vs. compound key discussion here.)
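As a rough illustration of the layout above, the following sketch builds the three tables and runs both lookups. It assumes SQLite via Python's sqlite3 module and uses the compound-key variant; the column types, sample rows, and the helper names `roles_for_user` and `users_in_role` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (UserId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Role (RoleId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- Compound primary key variant: one row per (user, role) pair.
CREATE TABLE UserRole (
    UserId INTEGER NOT NULL REFERENCES User(UserId),
    RoleId INTEGER NOT NULL REFERENCES Role(RoleId),
    PRIMARY KEY (UserId, RoleId)
);
INSERT INTO User VALUES (1, 'alice'), (2, 'bob');
INSERT INTO Role VALUES (10, 'admin'), (20, 'editor');
INSERT INTO UserRole VALUES (1, 10), (1, 20), (2, 20);
""")

def roles_for_user(user_id):
    """Return the names of all roles the given user belongs to."""
    rows = conn.execute(
        "SELECT r.Name FROM UserRole ur "
        "JOIN Role r ON r.RoleId = ur.RoleId "
        "WHERE ur.UserId = ? ORDER BY r.Name",
        (user_id,),
    )
    return [name for (name,) in rows]

def users_in_role(role_id):
    """Return the names of all users that belong to the given role."""
    rows = conn.execute(
        "SELECT u.Name FROM UserRole ur "
        "JOIN User u ON u.UserId = ur.UserId "
        "WHERE ur.RoleId = ? ORDER BY u.Name",
        (role_id,),
    )
    return [name for (name,) in rows]

print(roles_for_user(1))
print(users_in_role(20))
```

Each query returns only the rows for the one user or role asked about, so result sizes track the number of memberships involved rather than the size of the whole database.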

You'll want an index on (UserId, RoleId) that is UNIQUE, to enforce no duplicates. This will also help with any queries where you're trying to see if a specific user has a specific role (WHERE UserId = x AND RoleId = y).
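A small sketch of the unique index doing its job, again assuming SQLite through Python's sqlite3 module. The index name `IX_UserRole_User_Role` and the sample ids are invented for the example, and this version uses the surrogate-key layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserRole (
    UserRoleId INTEGER PRIMARY KEY,
    UserId INTEGER NOT NULL,
    RoleId INTEGER NOT NULL
);
-- Hypothetical index name; UNIQUE forbids repeating a (UserId, RoleId) pair.
CREATE UNIQUE INDEX IX_UserRole_User_Role ON UserRole (UserId, RoleId);
""")
conn.execute("INSERT INTO UserRole (UserId, RoleId) VALUES (1, 10)")

# Inserting the same pair a second time should be refused by the index.
try:
    conn.execute("INSERT INTO UserRole (UserId, RoleId) VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

def has_role(user_id, role_id):
    """Check whether a specific user holds a specific role."""
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM UserRole WHERE UserId = ? AND RoleId = ?)",
        (user_id, role_id),
    ).fetchone()
    return row[0] == 1

print(duplicate_rejected, has_role(1, 10), has_role(1, 20))
```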

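If you are looking up all the roles a user has, the (UserId, RoleId) index above can usually serve that query as well, since UserId is its leading column. A separate index on just UserId is generally only worth adding if your query plans show the composite one isn't being used.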

Conversely, if you are looking up all the users a given role has, an index on just RoleId will speed that up. If you don't run this query, or run it very rarely, then leaving this index out will slightly speed up inserts and updates, as there is one less index to maintain. This is the careful balancing act that is database tuning.
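One way to check which index a query actually uses is to ask the engine for its plan. The sketch below assumes SQLite via Python's sqlite3 module; other engines expose plans differently and may choose differently. The index names and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserRole (
    UserRoleId INTEGER PRIMARY KEY,
    UserId INTEGER NOT NULL,
    RoleId INTEGER NOT NULL
);
CREATE UNIQUE INDEX IX_UserRole_User_Role ON UserRole (UserId, RoleId);
""")

def plan(sql):
    """Return SQLite's query plan for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row is the textual description.
    return " | ".join(row[-1] for row in rows)

# Lookup by role with only the (UserId, RoleId) index available.
by_role_before = plan("SELECT UserId FROM UserRole WHERE RoleId = 10")

# Add the extra index on RoleId and ask again.
conn.execute("CREATE INDEX IX_UserRole_Role ON UserRole (RoleId)")
by_role_after = plan("SELECT UserId FROM UserRole WHERE RoleId = 10")

# Lookup by user: UserId is the leading column of the composite index.
by_user = plan("SELECT RoleId FROM UserRole WHERE UserId = 1")

print(by_role_before)
print(by_role_after)
print(by_user)
```

In SQLite, at least, the lookup by role goes from a scan to an index search once `IX_UserRole_Role` exists, while the lookup by user is already served by the composite index.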

gregmac answered Oct 27 '22 11:10
