Update 2009.04.24
The main point of my question is not developer confusion and what to do about it.
The point is to understand when delimited values are the right solution.
I've seen delimited data used in commercial product databases (Ektron lol).
SQL Server even has an XML datatype, so that could be used for the same purpose as delimited fields.
/end Update
The application I'm designing has some many-to-many relationships. In the past, I've often used associative tables to represent these in the database. This has caused some confusion among the developers.
Here's an example DB structure:
Document
---------------
ID (PK)
Title
CategoryIDs (varchar(4000))

Category
------------
ID (PK)
Title
There is a many-to-many relationship between Document and Category.
In this implementation, Document.CategoryIDs is a big pipe-delimited list of CategoryIDs.
To me, this is bad because it requires substring matching in queries, and a LIKE pattern with a leading wildcard cannot seek on an index, so the query has to examine every row. I think this will be slow and will not scale.
With that model, to get all Documents for a Category, you would need something like the following:
select * from Document where CategoryIDs like '%|' + @targetCategoryId + '|%'
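To make the fragility of that pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made purely so the example runs anywhere; the sample rows and the helper name documents_in_category are hypothetical). It suggests that the match only works when every stored list is wrapped in leading and trailing pipes:

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server schema above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Document (ID INTEGER PRIMARY KEY, Title TEXT, CategoryIDs TEXT)"
)
conn.executemany(
    "INSERT INTO Document (ID, Title, CategoryIDs) VALUES (?, ?, ?)",
    [
        (1, "Wrapped list", "|1|12|"),   # leading and trailing pipes
        (2, "Unwrapped list", "1|12"),   # same categories, no outer pipes
        (3, "Only category 12", "|12|"),
    ],
)


def documents_in_category(category_id):
    """Hypothetical helper: IDs of documents whose list contains |<id>|."""
    pattern = f"%|{int(category_id)}|%"
    rows = conn.execute(
        "SELECT ID FROM Document WHERE CategoryIDs LIKE ? ORDER BY ID",
        (pattern,),
    )
    return [row[0] for row in rows]


found_for_1 = documents_in_category(1)
found_for_12 = documents_in_category(12)
print(found_for_1, found_for_12)
```

Document 2 belongs to both categories but is never returned, because its list lacks the outer pipes. Nothing in the schema enforces that convention, so every writer has to get it right.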
My solution is to create an associative table as follows:
Document_Category
-------------------------------
DocumentID (PK)
CategoryID (PK)
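For comparison, a minimal sketch of the associative-table version, again using Python's sqlite3 module in place of SQL Server (an assumption for runnability). The sample data, the index name IX_Document_Category_CategoryID, and the helper titles_in_category are all hypothetical. The composite primary key covers lookups by document; the extra index is there for lookups by category:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE Document (ID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Category (ID INTEGER PRIMARY KEY, Title TEXT NOT NULL);

-- One row per (document, category) pair; the composite key blocks duplicates.
CREATE TABLE Document_Category (
    DocumentID INTEGER NOT NULL REFERENCES Document(ID),
    CategoryID INTEGER NOT NULL REFERENCES Category(ID),
    PRIMARY KEY (DocumentID, CategoryID)
);

-- Hypothetical index name; supports "all documents for a category".
CREATE INDEX IX_Document_Category_CategoryID
    ON Document_Category (CategoryID, DocumentID);

INSERT INTO Document VALUES (1, 'Spec'), (2, 'Memo'), (3, 'Invoice');
INSERT INTO Category VALUES (10, 'Engineering'), (20, 'Finance');
INSERT INTO Document_Category VALUES (1, 10), (2, 10), (2, 20), (3, 20);
""")


def titles_in_category(category_id):
    """Hypothetical helper: document titles for one category via a join."""
    rows = conn.execute(
        """
        SELECT d.Title
        FROM Document AS d
        JOIN Document_Category AS dc ON dc.DocumentID = d.ID
        WHERE dc.CategoryID = ?
        ORDER BY d.ID
        """,
        (category_id,),
    )
    return [row[0] for row in rows]


engineering_titles = titles_in_category(10)
finance_titles = titles_in_category(20)

# Inserting the same pair twice violates the composite primary key.
try:
    conn.execute("INSERT INTO Document_Category VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(engineering_titles, finance_titles, duplicate_rejected)
```

The query is an equality match on an indexed integer column rather than a substring search, and the database itself rejects duplicate or dangling pairs instead of relying on string conventions.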
This is confusing to the developers. Is there some elegant alternate solution that I'm missing?
I'm assuming there will be thousands of rows in Document. Category may have 40 rows or so. The primary concern is query performance. Am I over-engineering this?
Is there a case where it's preferred to store lists of IDs in database columns rather than pushing the data out to an associative table?
Consider also that we may need to create many-to-many relationships among documents. This would suggest an associative table Document_Document. Is that the preferred design or is it better to store the associated Document IDs in a single column?
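A minimal sketch of what a Document_Document table could look like, using Python's sqlite3 module as a stand-in for SQL Server. It assumes the document-to-document relationship is symmetric (if A relates to B, then B relates to A), which the question does not state; the column name RelatedDocumentID, the index name, and the helper related_document_ids are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Document (ID INTEGER PRIMARY KEY, Title TEXT NOT NULL);

-- Assumption: links are symmetric, so each pair is stored once with the
-- smaller ID first. Drop the CHECK if direction matters in your model.
CREATE TABLE Document_Document (
    DocumentID        INTEGER NOT NULL REFERENCES Document(ID),
    RelatedDocumentID INTEGER NOT NULL REFERENCES Document(ID),
    PRIMARY KEY (DocumentID, RelatedDocumentID),
    CHECK (DocumentID < RelatedDocumentID)
);
CREATE INDEX IX_Document_Document_Related
    ON Document_Document (RelatedDocumentID);

INSERT INTO Document VALUES (1, 'Spec'), (2, 'Memo'), (3, 'Invoice');
INSERT INTO Document_Document VALUES (1, 2), (2, 3);
""")


def related_document_ids(document_id):
    """Hypothetical helper: IDs linked to a document in either direction."""
    rows = conn.execute(
        """
        SELECT RelatedDocumentID FROM Document_Document WHERE DocumentID = ?
        UNION
        SELECT DocumentID FROM Document_Document WHERE RelatedDocumentID = ?
        ORDER BY 1
        """,
        (document_id, document_id),
    )
    return [row[0] for row in rows]


related_to_memo = related_document_ids(2)
related_to_spec = related_document_ids(1)
print(related_to_memo, related_to_spec)
```

Storing each pair once keeps the table half the size and avoids the two directions drifting apart, at the cost of a UNION when reading. If the relationship is directional, a plain two-column table without the CHECK would be the simpler choice.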
Thanks.
A typical example of a many-to-many relationship is one between students and classes. A student can register for many classes, and a class can include many students. The following example includes a Students table, which contains a record for each student, and a Classes table, which contains a record for each class.
When you have a many-to-many relationship between dimension-type tables, we provide the following guidance: Add each many-to-many related entity as a model table, ensuring it has a unique identifier (ID) column. Add a bridging table to store associated entities. Create one-to-many relationships between the three tables.
When you need to establish a many-to-many relationship between two or more tables, the simplest way is to use a Junction Table. A Junction table in a database, also referred to as a Bridge table or Associative Table, bridges the tables together by referencing the primary keys of each data table.
A many-to-many relationship refers to a relationship between tables in a database when a parent row in one table contains several child rows in the second table, and vice versa. Many-to-many relationships are often tricky to represent.
This is confusing to the developers.
Get better developers. That is the right approach.