Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recommended SQL database design for tags or tagging [closed]

I've heard of a few ways to implement tagging; using a mapping table between TagID and ItemID (makes sense to me, but does it scale?), adding a fixed number of possible TagID columns to ItemID (seems like a bad idea), Keeping tags in a text column that's comma separated (sounds crazy but could work). I've even heard someone recommend a sparse matrix, but then how do the tag names grow gracefully?

Am I missing a best practice for tags?

like image 294
dlamblin Avatar asked Aug 21 '08 19:08

dlamblin


People also ask

What is SQL tagging?

The query tag is used to perform an SQL query that returns a result set. For parameterized SQL queries, you use a nested param tag inside the query tag. In catalog. jsp , the value of the Add request parameter determines which book information should be retrieved from the database.

What is tagging in database?

In information systems, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark, multimedia, database record, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching.


2 Answers

Three tables (one for storing all items, one for all tags, and one for the relation between the two), properly indexed, with foreign keys set running on a proper database, should work well and scale properly.

Table: Item Columns: ItemID, Title, Content  Table: Tag Columns: TagID, Title  Table: ItemTag Columns: ItemID, TagID 
like image 74
Yaakov Ellis Avatar answered Sep 28 '22 10:09

Yaakov Ellis


Normally I would agree with Yaakov Ellis but in this special case there is another viable solution:

Use two tables:

Table: Item Columns: ItemID, Title, Content Indexes: ItemID  Table: Tag Columns: ItemID, Title Indexes: ItemId, Title 

This has some major advantages:

First it makes development much simpler: in the three-table solution for insert and update of item you have to lookup the Tag table to see if there are already entries. Then you have to join them with new ones. This is no trivial task.

Then it makes queries simpler (and perhaps faster). There are three major database queries which you will do: Output all Tags for one Item, draw a Tag-Cloud and select all items for one Tag Title.

All Tags for one Item:

3-Table:

SELECT Tag.Title    FROM Tag    JOIN ItemTag ON Tag.TagID = ItemTag.TagID  WHERE ItemTag.ItemID = :id 

2-Table:

SELECT Tag.Title FROM Tag WHERE Tag.ItemID = :id 

Tag-Cloud:

3-Table:

SELECT Tag.Title, count(*)   FROM Tag   JOIN ItemTag ON Tag.TagID = ItemTag.TagID  GROUP BY Tag.Title 

2-Table:

SELECT Tag.Title, count(*)   FROM Tag  GROUP BY Tag.Title 

Items for one Tag:

3-Table:

SELECT Item.*   FROM Item   JOIN ItemTag ON Item.ItemID = ItemTag.ItemID   JOIN Tag ON ItemTag.TagID = Tag.TagID  WHERE Tag.Title = :title 

2-Table:

SELECT Item.*   FROM Item   JOIN Tag ON Item.ItemID = Tag.ItemID  WHERE Tag.Title = :title 

But there are some drawbacks, too: It could take more space in the database (which could lead to more disk operations which is slower) and it's not normalized which could lead to inconsistencies.

The size argument is not that strong because the very nature of tags is that they are normally pretty small so the size increase is not a large one. One could argue that the query for the tag title is much faster in a small table which contains each tag only once and this certainly is true. But taking in regard the savings for not having to join and the fact that you can build a good index on them could easily compensate for this. This of course depends heavily on the size of the database you are using.

The inconsistency argument is a little moot too. Tags are free text fields and there is no expected operation like 'rename all tags "foo" to "bar"'.

So tldr: I would go for the two-table solution. (In fact I'm going to. I found this article to see if there are valid arguments against it.)

like image 33
Scheintod Avatar answered Sep 28 '22 12:09

Scheintod