Storing tags in a graph database

I've found some advice for setting up tagging systems in relational and document databases, but nothing for graph/multi-model databases.

I am trying to set up a tagging system for documents (let's call them "articles") in ArangoDB. I can think of two obvious ways to store tags in a multi-model (graph+document) database like Arango:

  • as an array within each article document (document database-style)
  • as a separate document class with each tag as a unique document and edges connecting tag documents to the article documents (something closer to relational database-style)

Are these in fact the two main ways to do this? Neither seems ideal. For example:

  • If I'm storing tags within each article document, I can index the tags and presumably ArangoDB is optimizing the space they use. However, I can't use graph features to link or traverse tags (or I have to do it separately).
  • If I'm storing tags as separate tag documents, it seems like extra overhead (an extra query) when I just want to get a list of tags on a document.

Which leads me to an explicit question: with regard to the latter option, is there any simple way to automatically make connected 'tag' documents show up within the article documents? E.g. have an array property that somehow 'mirrored' the tag.name properties of the connected tag documents?

General advice is also welcome.

ropeladder asked Mar 02 '16 19:03





1 Answer

You have already mentioned most of the relevant decision criteria. Here are a few more:

Tags stored inside the documents can be filtered through an array index, which can make queries on them fast. However, if you want to attach a rating or an explanation to each item of that tag array, a flat array of tag names gives you no place to put it. Counting the documents that carry a tag may also be more expensive than counting the edges that originate from a specific tag, and the same may hold for finding all tags that match a search criterion.
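As a rough sketch of that trade-off: the collection names below are assumptions for illustration, not taken from the question. `articles` holds the article documents with a `tags` array, `tags` is a vertex collection with one document per tag, and `tagged` is an edge collection whose edges point from a tag to an article. With the embedded array, a lookup by tag might be written like this, and it can use an array index created on `tags[*]`:

```aql
// Embedded array: keys of all articles carrying the tag "graph".
// Assumes an array index on the attribute path tags[*] of `articles`.
FOR a IN articles
    FILTER 'graph' IN a.tags
    RETURN a._key
```

With the edge-based layout, counting the articles for one tag only touches the edge collection, which always has an index on `_from` and `_to`:

```aql
// Edge collection: number of edges that originate from the tag vertex tags/graph.
// `tagged` and the key "graph" are hypothetical names.
FOR e IN tagged
    FILTER e._from == 'tags/graph'
    COLLECT WITH COUNT INTO numArticles
    RETURN numArticles
```

Since an edge is an ordinary document, it could also carry extra attributes such as a rating, which is what the flat array cannot do.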

One of the strengths of a multi-model database is that you don't have to choose between the two approaches. You can have an edge collection that connects tags (with attributes) to your documents and also keep an indexed array of the same (flat) tags inside each document. If you find that all (or most) of your queries use just one method, try to convert the rest and remove the other one. If that doesn't work, your application simply needs both of them.
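If you keep both, the flat array has to be refreshed from the edges by the application. A minimal sketch of such a sync, assuming the same hypothetical names as before (`articles`, an edge collection `tagged` pointing from tag to article, and a `name` attribute on each tag document):

```aql
// For every article, collect the names of the tags connected to it
// and store them as a flat array on the article document.
// Edges are assumed to point tag -> article, so tags are reached INBOUND.
FOR a IN articles
    LET names = (
        FOR t IN 1..1 INBOUND a._id tagged
            RETURN t.name
    )
    UPDATE a WITH { tags: names } IN articles
```

This is a one-off copy rather than an automatic mirror: it would have to be rerun (or narrowed to the affected article with a `FILTER`) whenever a `tagged` edge is added or removed.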

In both cases, finding other tagged documents alongside can be done in a subquery:

// Articles matching a fulltext search on the 'text' attribute
LET docs = (FOR ftDoc IN FULLTEXT(articles, 'text', 'search') RETURN ftDoc)
// All distinct tags used by those articles
LET tags = UNIQUE(FLATTEN(docs[*].tags))
// Keys of every article that carries at least one of those tags
LET otherArticles = (FOR oneTag IN tags
    FOR oneD IN articles FILTER oneTag IN oneD.tags RETURN DISTINCT oneD._key)
RETURN {articles: docs, tags: tags, otherArticles: otherArticles}
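The query above reads the tags from the embedded array. For the edge-based layout, a comparable subquery might look like the following sketch. It again assumes a hypothetical edge collection `tagged` with edges pointing from tag to article and a `name` attribute on each tag document, and it reuses the `FULLTEXT()` call from the query above:

```aql
FOR a IN FULLTEXT(articles, 'text', 'search')
    // Tag documents connected to this article (edges point tag -> article)
    LET tagDocs = (FOR t IN 1..1 INBOUND a._id tagged RETURN t)
    // Other articles reachable from any of those tags
    LET others = (
        FOR t IN tagDocs
            FOR other IN 1..1 OUTBOUND t._id tagged
                FILTER other._id != a._id
                RETURN DISTINCT other._key
    )
    RETURN { article: a, tags: tagDocs[*].name, otherArticles: others }
```

Because the tags arrive in the same result as the article, the client still issues only one query, even though the tags live in separate documents.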
dothebart answered Sep 21 '22 04:09
