 

Should a tag be its own resource or a nested property?

I am at a crossroads deciding whether tags should be their own resource or a nested property of a note. This question touches a bit on RESTful design and database storage.

Context: I have a note resource. Users can have many notes. Each note can have many tags.

Functional Goals: I need to create routes to do the following:
1) Fetch all user tags. Something like: GET /users/:id/tags
2) Delete tag(s) associated with a note.
3) Add tag to a specific note.

Data/Performance Goals
1) Fetching user tags should be fast. This is for the purpose of "autosuggest"/"autocomplete".
2) Prevent duplicates (as much as possible). I want tags to be reused as much as possible for the purpose of being able to query data by tag. For example, I'd like to mitigate scenarios where the user types a tag such as "superheroes" when the tag "superhero" already exists.

That being said, the way I see it, there are two approaches to storing tags on a note resource:

1) tags as nested property. For example:

type: 'notes',
attributes: {
  id: '123456789',
  body: '...',
  tags: ['batman', 'superhero'] 
}

2) tags as their own resource. For example:

type: 'notes',
data: {
  id: '123456789',
  body: '...',
  tags: [1,2,3] // <= Tag IDs instead of strings
}

Either of the approaches above could work, but I am looking for a solution that scales and keeps data consistent (imagine a million notes and ten million tags). At this point, I am leaning toward option #1 since it is easier to handle code-wise, but it may not necessarily be the right option.

I am very interested in hearing some thoughts about the different approaches, especially since I cannot find any similar questions on SO about this topic.

Update: Thank you for the answers. One of the most important things for me is understanding why one approach is advantageous over the other. I'd like the answer to include something of a pro/con list.

dipole_moment asked Dec 05 '16 01:12


1 Answer

tl;dr

Considering your requirements, IMO you should store tags as resources and your API should return the notes with the tags as embedded properties.


Database design

Keep notes and tags as separate collections (or tables). Since you have many notes and many tags, and the core functionality depends on searching/autocompleting these tags, this will improve performance when searching for notes by tag. A very basic design can look like:

notes

{
    'id': 101,    // noteid
    'title': 'Note title',
    'body': 'Some note',
    'tags': ['tag1', 'tag2', ...]
}

tags

{
    'id': 'tag1',    // tagid
    'name': 'batman',
    'description': 'the dark knight',
    'related': ['tagx', 'tagy', ...],
    'notes': [101, 103, ...]
}

You can use the related property to handle duplicates: in place of tagx and tagy, list the ids of tags that are similar to this one.
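As a rough illustration of one way to read the related property, the sketch below treats related tags as near-duplicates and returns notes for a tag together with the notes of its related tags. The in-memory dictionaries, the sample data, and the function name notes_for_tag are assumptions made for the example, not part of the design above.

```python
# Hypothetical in-memory stand-ins for the notes and tags collections.
notes = {
    101: {"id": 101, "title": "Note title", "body": "Some note", "tags": ["tag1"]},
    103: {"id": 103, "title": "Another note", "body": "More text", "tags": ["tag2"]},
}
tags = {
    "tag1": {"id": "tag1", "name": "superhero", "related": ["tag2"], "notes": [101]},
    "tag2": {"id": "tag2", "name": "superheroes", "related": ["tag1"], "notes": [103]},
}


def notes_for_tag(tag_id):
    """Return sorted note ids for a tag and for every tag in its `related` array."""
    tag_ids = [tag_id] + tags[tag_id]["related"]
    note_ids = set()
    for tid in tag_ids:
        note_ids.update(tags[tid]["notes"])
    return sorted(note_ids)
```

With this reading, a query for "superhero" also finds notes tagged "superheroes", so a near-duplicate tag does not hide notes from the search.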


API Design

1. Fetching notes for user:

GET /users/{userid}/notes

Embed the tags within the notes object when you handle this route on the backend. The notes object your API sends should look something like this:

{
    'id': 101,
    'title': 'Note title',
    'body': 'Some note',
    'tags': ['batman']    // replacing the tag1 by its name from tag collection
}
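A minimal sketch of that embedding step, assuming the tags collection is available as a dictionary keyed by tag id (serialize_note and tags_by_id are hypothetical names used only for this example):

```python
def serialize_note(note, tags_by_id):
    """Return a copy of the note with each tag id replaced by the tag's name."""
    return {**note, "tags": [tags_by_id[tid]["name"] for tid in note["tags"]]}


# Hypothetical sample data matching the shapes shown above.
tags_by_id = {"tag1": {"id": "tag1", "name": "batman"}}
stored_note = {"id": 101, "title": "Note title", "body": "Some note", "tags": ["tag1"]}

api_note = serialize_note(stored_note, tags_by_id)
```

The stored note keeps its tag ids; only the copy sent to the client carries the names.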

2. Fetching tags for user:

GET /users/{userid}/tags

If it's not required, you can skip sending the notes property, which contains the ids of the notes that use the tag.
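A sketch of how this route's handler might build its response for autocomplete. It assumes each tag carries a user_id field, which the design above does not show, and the function name user_tags and the prefix parameter are likewise hypothetical.

```python
# Hypothetical tag records; `user_id` is an assumed field linking a tag to its owner.
tags = [
    {"id": "tag1", "user_id": 7, "name": "batman", "notes": [101, 103]},
    {"id": "tag2", "user_id": 7, "name": "superhero", "notes": [101]},
    {"id": "tag3", "user_id": 8, "name": "batmobile", "notes": [200]},
]


def user_tags(tags, user_id, prefix=""):
    """Return the user's tags whose name starts with `prefix`, without the `notes` list."""
    result = []
    for tag in tags:
        if tag["user_id"] == user_id and tag["name"].startswith(prefix):
            # Drop the note ids: autocomplete only needs the tag itself.
            result.append({k: v for k, v in tag.items() if k != "notes"})
    return result
```

In a real database the same filter would typically be an indexed query on the tags collection rather than a Python loop.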

3. Deleting tags for notes:

DELETE /users/{userid}/{noteid}/{tag}

4. Adding tags for notes:

PUT /users/{userid}/{noteid}/{tag}
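Because the design stores the relationship on both sides (tag ids on the note, note ids on the tag), the handlers for these two routes have to update both collections. A minimal sketch under that assumption, using in-memory dictionaries and hypothetical function names:

```python
def add_tag(notes, tags, note_id, tag_id):
    """Attach a tag to a note, updating both sides; repeating the call changes nothing."""
    note, tag = notes[note_id], tags[tag_id]
    if tag_id not in note["tags"]:
        note["tags"].append(tag_id)
    if note_id not in tag["notes"]:
        tag["notes"].append(note_id)


def remove_tag(notes, tags, note_id, tag_id):
    """Detach a tag from a note, updating both sides; a missing link is ignored."""
    note, tag = notes[note_id], tags[tag_id]
    if tag_id in note["tags"]:
        note["tags"].remove(tag_id)
    if note_id in tag["notes"]:
        tag["notes"].remove(note_id)


# Hypothetical sample data.
notes = {101: {"id": 101, "tags": []}}
tags = {"tag1": {"id": "tag1", "name": "batman", "notes": []}}

add_tag(notes, tags, 101, "tag1")
add_tag(notes, tags, 101, "tag1")  # second call is a no-op, as expected of PUT
after_add = (list(notes[101]["tags"]), list(tags["tag1"]["notes"]))

remove_tag(notes, tags, 101, "tag1")
after_remove = (list(notes[101]["tags"]), list(tags["tag1"]["notes"]))
```

In a real store the two writes should happen in one transaction (or equivalent) so the two sides cannot drift apart.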

As for performance, fetching tags for a user should be fast because the tags live in their own collection. Handling duplicates will also be simpler because you can add similar tags (by id or name) to the related array. Hope this was helpful.


Why not to keep tags as nested property

  • The design is not as scalable as the previous one. If tags are a nested property and a tag has to be edited or given extra information, every note containing that tag must be changed, since multiple notes can contain the same tag. With tags as resources, notes reference tags by id, so a single change in the tags collection/table is enough.

  • Handling duplicate tags might not be as simple as when keeping them as separate resources.

  • When searching for tags you will need to search for all the tags embedded inside every note. This adds overhead.
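To make the first point concrete, here is a small sketch comparing a tag rename under both designs. The data and the function names rename_nested and rename_resource are hypothetical; each function returns how many records it had to write.

```python
def rename_nested(notes, old_name, new_name):
    """Nested design: rewrite the tag string in every note that contains it."""
    touched = 0
    for note in notes:
        if old_name in note["tags"]:
            note["tags"] = [new_name if t == old_name else t for t in note["tags"]]
            touched += 1
    return touched


def rename_resource(tags, tag_id, new_name):
    """Resource design: notes hold tag ids, so only the tag record changes."""
    tags[tag_id]["name"] = new_name
    return 1


# Hypothetical sample data for each design.
nested_notes = [
    {"id": 101, "tags": ["batman", "superhero"]},
    {"id": 102, "tags": ["batman"]},
    {"id": 103, "tags": ["superhero"]},
]
tag_resources = {"tag1": {"id": "tag1", "name": "batman"}}

nested_writes = rename_nested(nested_notes, "batman", "the-batman")
resource_writes = rename_resource(tag_resources, "tag1", "the-batman")
```

The nested rename grows with the number of notes using the tag, while the resource rename stays at a single write.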


The only advantage of using tags as nested property IMO is it'll make it easier to add or delete tags for a particular note.

Divyanshu Maithani answered Oct 16 '22 06:10