
Content tagging with MongoDB

I want to implement content tagging using MongoDB. In a relational database, the best approach would be a many-to-many relation between the content table (say, "products") and a tags table. But what is the best approach with NoSQL databases?

Would it be better to put every tag in a tags array in the "content" document, or to store references to the tags in a string?

theZiki asked Dec 14 '12 16:12



1 Answer

In most cases where you have an n:m relation in MongoDB, you should use embedding instead of referencing. So I would recommend giving each product a "tags" array that holds the tag names. I assume that looking at a single product will be the most frequent use case in your system. This design lets you show the user a product together with its list of tag names using a single database query.
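A minimal sketch of that embedded design, assuming a collection called `products` and made-up field values (neither is given in the question). The helpers only build the filter documents, so they run without a database; the commented lines show how they might be passed to PyMongo.

```python
# Hypothetical product document: the tag names are embedded directly,
# so loading the product also loads everything needed to display its tags.
product = {
    "_id": 1,
    "name": "Trail running shoe",
    "tags": ["running", "outdoor", "shoes"],
}


def filter_by_tag(tag):
    """Filter for products whose tags array contains `tag`."""
    # An equality match on an array field matches any element of the array.
    return {"tags": tag}


def filter_by_all_tags(tags):
    """Filter for products that carry every tag in `tags`."""
    return {"tags": {"$all": list(tags)}}


# With PyMongo and a running server (assumed setup, not executed here):
#   db.products.create_index("tags")      # multikey index over the array
#   db.products.find_one({"_id": 1})      # product and tag names in one query
#   db.products.find(filter_by_tag("running"))
```

An index on the `tags` field is what keeps "all products with tag X" lookups fast once the collection grows.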

When you need additional metadata about the tags that you don't want to bind to a product (like a long-text description of a tag), you can create an additional tags collection in which the name field gets a unique index, for fast lookup and to avoid duplicates. When the user clicks on or hovers over a tag name, you can use an additional query to get the tag details.
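One way the separate tags collection might look, as a sketch. The collection name `tags` and the `description` field are assumptions for illustration; the index spec is written in the shape PyMongo's `create_index` accepts.

```python
def tag_document(name, description):
    """Build a document for the (assumed) tags collection."""
    return {"name": name, "description": description}


# Key list and options for a unique index on the tag name.
TAG_NAME_INDEX = ([("name", 1)], {"unique": True})


def tag_details_filter(name):
    """Filter used to load one tag's details by its name."""
    return {"name": name}


# With PyMongo and a running server (assumed setup, not executed here):
#   keys, options = TAG_NAME_INDEX
#   db.tags.create_index(keys, **options)
#   db.tags.insert_one(tag_document("running", "Gear for runners"))
#   db.tags.find_one(tag_details_filter("running"))   # on click or hover
# Inserting a second tag with the same name would then fail with
# pymongo.errors.DuplicateKeyError.
```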

A problematic case in this design is deleting or renaming a tag: you then have to edit every product that includes it. But because MongoDB has no foreign keys with ON DELETE CASCADE as SQL databases do, you will always have that problem when documents reference one another.
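Editing every product does not have to mean one update per product. As a sketch, the helpers below build the filter and update documents for a multi-document update; the `products` collection name is again an assumption.

```python
def rename_tag_update(old, new):
    """Return (filter, update) that renames `old` to `new` in tags arrays."""
    # The positional `$` operator targets the first array element that
    # matched the filter, so the filter must mention the array field.
    return {"tags": old}, {"$set": {"tags.$": new}}


def delete_tag_update(tag):
    """Return (filter, update) that removes `tag` from tags arrays."""
    # `$pull` removes every array element equal to the given value.
    return {"tags": tag}, {"$pull": {"tags": tag}}


# With PyMongo and a running server (assumed setup, not executed here):
#   db.products.update_many(*rename_tag_update("shoes", "footwear"))
#   db.products.update_many(*delete_tag_update("outdoor"))
```

The rename assumes a tag name appears at most once per product and that no product already carries the new name; otherwise a duplicate entry would result. If a separate tags collection exists, its document needs the same rename or delete.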

Renaming tags could be made easier by storing ObjectIds instead of names in the product's tags array. But IDs have the disadvantage that they are meaningless to the user. To show a product page you need the tag names, so you have to fetch them from the tags collection, which requires an additional database query.
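If IDs are stored, the extra lookup can probably be kept to one query per product page by fetching all referenced tags at once with `$in`. A sketch, with hypothetical helper names; plain integers stand in for ObjectIds in the comments.

```python
def tags_by_id_filter(tag_ids):
    """Filter that matches every tag document referenced by a product."""
    return {"_id": {"$in": list(tag_ids)}}


def names_in_product_order(tag_ids, tag_docs):
    """Map fetched tag documents back to names, in the product's order."""
    # The query does not promise any result order, so index by _id first.
    names = {doc["_id"]: doc["name"] for doc in tag_docs}
    # IDs whose tag document no longer exists are skipped.
    return [names[tag_id] for tag_id in tag_ids if tag_id in names]


# With PyMongo and a running server (assumed setup, not executed here):
#   product = db.products.find_one({"_id": 1})       # tags: [11, 12, 13]
#   docs = db.tags.find(tags_by_id_filter(product["tags"]))
#   names_in_product_order(product["tags"], docs)
```

With this layout a rename touches only the one document in the tags collection, at the price of the second query on every product view.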

Philipp answered Sep 19 '22 21:09