I have articles on my site, and I would like to add tags which would describe each article, but I'm having problems with design mysql table for tags. I have two ideas:
So when I want tags for article with ID 1, I would run
SELECT ... FROM tags WHERE `article_id`=1;
But, I would also like to know 3 most similar articles by comparing tags, so if I have article which has tags "php,mysql,erlang", and 5 articles with tags: "php,mysql", "erlang,ruby", "php erlang", "mysql,erlang,javascript", I would choose 1., 3. and 4., since those 3 have most same tags with main article.
Also other question, what is the best way to get 10 "most used tags" ?
Generally, for this kind of many-to-many relationship, there are three tables :
article
" table
tag
" table
tags_articles
" table, which acts as a join table, and contains only :
id_article
: foreign key that points to an articleid_tag
: foreign key that points to a tag
This way, there is no duplication of any tag's data : for each tag, there is one, and only one, line in the tag
table.
And, for each article, you can have several tags (i.e. several lines in the tags_articles
table) ; and, of course, for each tags, you can have several articles.
Getting a list of tags for an article, with this idea, is a matter of an additionnal query, like :
select tag.*
from tag
inner join tags_articles on tag.id = tags_articles.id_tag
where tags_articles.id_article = 123
Getting the three "most similar" articles would mean :
Not tested, but an idea might be something that would look like this :
select article.id, count(*) as nb_identical_tags
from article
inner join tags_articles on tags_articles.id_article = article.id
inner join tag on tag.id = tags_articles.id_tag
where tag.name in ('php', 'mysql', 'erlang')
and article.id <> 123
group by article.id
order by count(*) desc
limit 3
Basically, you :
where
clause, without the group by
clause, there would be two lines for that articlegroup by article.id
, there will be only one line per article
count
, to find out how many tags each article has in common with the initial oneFirst off, you'll want to use Pascal MARTIN's suggestion about the table design.
As for finding similar articles, here's something to get you started. Given that @article_id is the article you want to find matches for, and @tag1, @tag2, @tag3 are the tags for that article:
SELECT article_id, count(*)
FROM tags_articles
WHERE article_id <> @article_id
AND tag_id IN (@tag1, @tag2, @tag3)
GROUP BY article_id
ORDER BY count(*) DESC
LIMIT 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With