Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement tag system

I was wondering what the best way is to implement a tag system, like the one used on SO. I was thinking of this but I can't come up with a good scalable solution.

I was thinking of having a basic 3 table solution: having a tags table, an articles tables and a tag_to_articles table.

Is this the best solution to this problem, or are there alternatives? Using this method the table would get extremely large in time, and for searching this is not too efficient I assume. On the other hand it is not that important that the query executes fast.

like image 577
Saif Bechan Avatar asked Nov 27 '09 19:11

Saif Bechan


People also ask

What is tagging in software development?

In programming, a tag is an argument to a subroutine that determines other arguments passed to it, which is used as a way to pass indefinite number of tagged parameters to the subroutine; notably, tags are used for a number of system calls in AmigaOS v2.

What is meant by tagging system?

In occupational health and safety, a tagging system is a system of recording and displaying the status of a machine or equipment, enabling staff to view whether it is in working order.

How do I manage tags in mysql?

you can store JSON array in mysql>=5.7 for tags like below : ["tag1","tag2",...] if you use an extra table for store tags , you need to join and search for adding a new tag , its bad way when we have too many tags in database !


1 Answers

I believe you'll find interesting this blog post: Tags: Database schemas

The Problem: You want to have a database schema where you can tag a bookmark (or a blog post or whatever) with as many tags as you want. Later then, you want to run queries to constrain the bookmarks to a union or intersection of tags. You also want to exclude (say: minus) some tags from the search result.

“MySQLicious” solution

In this solution, the schema has got just one table, it is denormalized. This type is called “MySQLicious solution” because MySQLicious imports del.icio.us data into a table with this structure.

enter image description hereenter image description here

Intersection (AND) Query for “search+webservice+semweb”:

SELECT * FROM `delicious` WHERE tags LIKE "%search%" AND tags LIKE "%webservice%" AND tags LIKE "%semweb%" 

Union (OR) Query for “search|webservice|semweb”:

SELECT * FROM `delicious` WHERE tags LIKE "%search%" OR tags LIKE "%webservice%" OR tags LIKE "%semweb%" 

Minus Query for “search+webservice-semweb”

SELECT * FROM `delicious` WHERE tags LIKE "%search%" AND tags LIKE "%webservice%" AND tags NOT LIKE "%semweb%" 

“Scuttle” solution

Scuttle organizes its data in two tables. That table “scCategories” is the “tag”-table and has got a foreign key to the “bookmark”-table.

enter image description here

Intersection (AND) Query for “bookmark+webservice+semweb”:

SELECT b.* FROM scBookmarks b, scCategories c WHERE c.bId = b.bId AND (c.category IN ('bookmark', 'webservice', 'semweb')) GROUP BY b.bId HAVING COUNT( b.bId )=3 

First, all bookmark-tag combinations are searched, where the tag is “bookmark”, “webservice” or “semweb” (c.category IN ('bookmark', 'webservice', 'semweb')), then just the bookmarks that have got all three tags searched for are taken into account (HAVING COUNT(b.bId)=3).

Union (OR) Query for “bookmark|webservice|semweb”: Just leave out the HAVING clause and you have union:

SELECT b.* FROM scBookmarks b, scCategories c WHERE c.bId = b.bId AND (c.category IN ('bookmark', 'webservice', 'semweb')) GROUP BY b.bId 

Minus (Exclusion) Query for “bookmark+webservice-semweb”, that is: bookmark AND webservice AND NOT semweb.

SELECT b. * FROM scBookmarks b, scCategories c WHERE b.bId = c.bId AND (c.category IN ('bookmark', 'webservice')) AND b.bId NOT IN (SELECT b.bId FROM scBookmarks b, scCategories c WHERE b.bId = c.bId AND c.category = 'semweb') GROUP BY b.bId HAVING COUNT( b.bId ) =2 

Leaving out the HAVING COUNT leads to the Query for “bookmark|webservice-semweb”.


“Toxi” solution

Toxi came up with a three-table structure. Via the table “tagmap” the bookmarks and the tags are n-to-m related. Each tag can be used together with different bookmarks and vice versa. This DB-schema is also used by wordpress. The queries are quite the same as in the “scuttle” solution.

enter image description here

Intersection (AND) Query for “bookmark+webservice+semweb”

SELECT b.* FROM tagmap bt, bookmark b, tag t WHERE bt.tag_id = t.tag_id AND (t.name IN ('bookmark', 'webservice', 'semweb')) AND b.id = bt.bookmark_id GROUP BY b.id HAVING COUNT( b.id )=3 

Union (OR) Query for “bookmark|webservice|semweb”

SELECT b.* FROM tagmap bt, bookmark b, tag t WHERE bt.tag_id = t.tag_id AND (t.name IN ('bookmark', 'webservice', 'semweb')) AND b.id = bt.bookmark_id GROUP BY b.id 

Minus (Exclusion) Query for “bookmark+webservice-semweb”, that is: bookmark AND webservice AND NOT semweb.

SELECT b. * FROM bookmark b, tagmap bt, tag t WHERE b.id = bt.bookmark_id AND bt.tag_id = t.tag_id AND (t.name IN ('Programming', 'Algorithms')) AND b.id NOT IN (SELECT b.id FROM bookmark b, tagmap bt, tag t WHERE b.id = bt.bookmark_id AND bt.tag_id = t.tag_id AND t.name = 'Python') GROUP BY b.id HAVING COUNT( b.id ) =2 

Leaving out the HAVING COUNT leads to the Query for “bookmark|webservice-semweb”.

like image 154
Nick Dandoulakis Avatar answered Oct 10 '22 19:10

Nick Dandoulakis