Mongo schema: Todo-list with groups

Tags: mongodb, nosql

I want to learn MongoDB and decided to build a more complex todo application for learning purposes.

The basic idea is a task list where tasks are grouped in folders. Users may have different access rights to those folders (read, write), and tasks may be moved to other folders. Usually (especially for syncing) tasks will be requested by-folder and not individually.

I have thought of three approaches and would like to hear your opinion on them. Maybe I missed some points or am thinking about this the wrong way.

A - List of References

  • Collections: User, Folder, Task
  • Folders contain references to Users
  • Folders contain references to Tasks

Problem

  • When updating a Task, a reference to its Folder is needed. Either that reference is stored within the Task (redundancy) or it must be passed with each API call.
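
A minimal in-memory sketch of approach A, using plain JavaScript objects as stand-ins for documents (no MongoDB driver involved). The field names writers, readers and taskIds and the helper findFolderOfTask are assumptions for illustration, not part of the question. It shows why an update that only knows the task id has to search the folders first:

```javascript
// Approach A: folders hold the references (hypothetical field names).
const tasks = [
  { _id: "t1", title: "Buy milk", done: false },
  { _id: "t2", title: "Write report", done: false },
];
const folders = [
  { _id: "f1", name: "Home", writers: ["u1"], readers: [], taskIds: ["t1"] },
  { _id: "f2", name: "Work", writers: [], readers: ["u1"], taskIds: ["t2"] },
];

// Without a folder reference in the task or in the API call,
// the owning folder has to be looked up by scanning taskIds.
function findFolderOfTask(taskId) {
  return folders.find((f) => f.taskIds.includes(taskId));
}

// Update a task only if the user has write access to its folder.
function updateTask(userId, taskId, changes) {
  const folder = findFolderOfTask(taskId);
  if (!folder || !folder.writers.includes(userId)) return false;
  const task = tasks.find((t) => t._id === taskId);
  if (!task) return false;
  Object.assign(task, changes);
  return true;
}
```

In MongoDB the lookup would presumably be a query on the array field, e.g. db.folder.findOne({ taskIds: taskId }), where the collection name folder is likewise an assumption.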

B - Subdocuments

  • Collections: User, Folder
  • Folders contain references to Users
  • Tasks are subdocuments within Folders

Problem

  • There is no way to update a Task without knowing its Folder. Both must be transmitted here as well, but unlike in A there is no redundancy.
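
A minimal in-memory sketch of approach B, again with plain JavaScript objects standing in for documents. The field names and both helper functions are assumptions for illustration. It shows that every task operation starts from a folder id:

```javascript
// Approach B: tasks are embedded in their folder (hypothetical field names).
const folders = [
  {
    _id: "f1",
    name: "Home",
    writers: ["u1"],
    tasks: [{ _id: "t1", title: "Buy milk", done: false }],
  },
  { _id: "f2", name: "Work", writers: ["u1"], tasks: [] },
];

// Updating a task requires the folder id to locate the subdocument.
function updateTask(folderId, taskId, changes) {
  const folder = folders.find((f) => f._id === folderId);
  const task = folder && folder.tasks.find((t) => t._id === taskId);
  if (!task) return false;
  Object.assign(task, changes);
  return true;
}

// Moving a task means removing it from one folder and adding it to another.
function moveTask(fromId, toId, taskId) {
  const from = folders.find((f) => f._id === fromId);
  const to = folders.find((f) => f._id === toId);
  if (!from || !to) return false;
  const index = from.tasks.findIndex((t) => t._id === taskId);
  if (index === -1) return false;
  const [task] = from.tasks.splice(index, 1);
  to.tasks.push(task);
  return true;
}
```

In MongoDB the update would typically use the positional $ operator, e.g. a query of { _id: folderId, "tasks._id": taskId } with { $set: { "tasks.$.done": true } }. A move touches two folder documents, so it is not a single atomic write unless a transaction is used.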

C - References

  • Collections: User, Folder, Task
  • Folders contain references to Users
  • Tasks keep a reference to their Folder

Problem

  • Requesting a folder means searching a long list of tasks instead of following direct references (A) or just returning the folder (B).
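
A minimal in-memory sketch of approach C, with plain JavaScript objects standing in for documents. The field name folderId and both helper functions are assumptions for illustration. It shows the trade-off: reading a folder is a filter over all tasks, while moving a task is a one-field change:

```javascript
// Approach C: each task stores its folder (hypothetical field names).
const folders = [
  { _id: "f1", name: "Home", writers: ["u1"] },
  { _id: "f2", name: "Work", writers: ["u1"] },
];
const tasks = [
  { _id: "t1", folderId: "f1", title: "Buy milk" },
  { _id: "t2", folderId: "f2", title: "Write report" },
  { _id: "t3", folderId: "f1", title: "Call plumber" },
];

// Requesting a folder filters the whole task list by folderId.
function tasksInFolder(folderId) {
  return tasks.filter((t) => t.folderId === folderId);
}

// Moving a task is a single-field change on the task itself.
function moveTask(taskId, toFolderId) {
  const task = tasks.find((t) => t._id === taskId);
  if (!task) return false;
  task.folderId = toFolderId;
  return true;
}
```

In a real database, an index on the folder field would let the server avoid scanning every task, which softens the problem described above.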

K. D. asked Aug 19 '14 12:08


1 Answer

If you don't need any metadata for the folder except the name you could also go with:

  • Collections: User, Task
  • Task has a field folder
  • User has arrays read_access and write_access

Then

  • You can get a list of all folders with

    db.task.distinct("folder")

  • The folders a specific user can access are retrieved automatically with the user document, so they are basically known at login.

  • You can get all tasks a user can read with

    db.task.find( { folder: { $in: read_access } } )

    with read_access being the respective array from your user document. The same works for write_access.

  • You can find all tasks within a folder with a simple find query for the folder name.

  • Renaming a folder can be achieved with one update query on each of the collections.

  • Creating a folder or moving a task to another folder is equally simple.
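
The operations listed above can be sketched in memory with plain JavaScript, without a MongoDB server. The sample data and the function names are made up for illustration; only the field names folder, read_access and write_access come from the schema above:

```javascript
// Folder is just a name on the task (sample data is made up).
const users = [
  { _id: "u1", read_access: ["Home", "Work"], write_access: ["Home"] },
];
const tasks = [
  { _id: "t1", folder: "Home", title: "Buy milk" },
  { _id: "t2", folder: "Work", title: "Write report" },
  { _id: "t3", folder: "Private", title: "Plan surprise" },
];

// Counterpart of db.task.distinct("folder").
function distinctFolders() {
  return [...new Set(tasks.map((t) => t.folder))];
}

// Counterpart of db.task.find({ folder: { $in: read_access } }).
function readableTasks(user) {
  return tasks.filter((t) => user.read_access.includes(t.folder));
}

// Renaming touches both collections: every task and every access array.
function renameFolder(oldName, newName) {
  const swap = (name) => (name === oldName ? newName : name);
  for (const t of tasks) t.folder = swap(t.folder);
  for (const u of users) {
    u.read_access = u.read_access.map(swap);
    u.write_access = u.write_access.map(swap);
  }
}

// Moving a task is a one-field update.
function moveTask(taskId, folder) {
  const task = tasks.find((t) => t._id === taskId);
  if (task) task.folder = folder;
  return Boolean(task);
}
```

One consequence visible in the sketch: a folder only shows up in the distinct list while at least one task carries its name, so an empty folder exists solely through the users' access arrays.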

So without metadata for folders, that is what I would do. If you need metadata for folders, it becomes a little more complicated, but you could manage it independently of the tasks and users above, using a folder collection that contains the metadata, with _id being the folder name referenced in user and task.


Edit:

Comparison of the different approaches

I stumbled over this link, which might be of interest to you. It discusses transitioning from a relational database model to MongoDB. The difference is that in a relational database you usually aim for third normal form, where one of the goals is to avoid bias toward any particular access pattern, whereas in MongoDB you can model your data to best fit your access patterns (while taking care not to introduce data anomalies through redundancy).

So with that in mind:

  • Your model A is how you could do it in a relational database (each type of information in its own table, referenced by id).
  • Model B would be tailored to an access pattern where you always list a complete folder and tasks are only edited while the folder is open (if you retrieve one folder, you have all its tasks without an additional query).
  • C would be a different relational model than A and, I think, a little closer to third normal form (without knowing the exact tables).
  • My suggestion would not support folder access as well as B but would make it easier to show and edit single tasks.

Problems that could come up with the schemas: since A and C are basically relational, you can run into problems with foreign keys, because MongoDB does not enforce foreign key constraints (e.g. in C you could delete a folder while tasks still reference it, or in A delete a task without removing its reference from the folder). You could work around this by enforcing the constraints in the application. For B, the 16 MB document limit could become a problem; you could work around it by letting a folder split into multiple documents once it reaches a certain task count.
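
Both workarounds can be sketched in memory with plain JavaScript. Everything here is an assumption for illustration: the cap MAX_TASKS_PER_BUCKET (set absurdly low so the split is visible), the seq field, and all function names. The first half splits a folder from approach B into several documents; the second half is an application-level stand-in for a foreign key check in approach C:

```javascript
// Hypothetical cap; a real value would be chosen well below the 16 MB limit.
const MAX_TASKS_PER_BUCKET = 2;

// Approach B with buckets: one folder may span several documents.
const folderBuckets = [];

function addTask(folderId, task) {
  // Reuse the first bucket of this folder that still has room.
  let bucket = folderBuckets.find(
    (b) => b.folderId === folderId && b.tasks.length < MAX_TASKS_PER_BUCKET
  );
  if (!bucket) {
    const seq = folderBuckets.filter((b) => b.folderId === folderId).length;
    bucket = { folderId, seq, tasks: [] };
    folderBuckets.push(bucket);
  }
  bucket.tasks.push(task);
}

// Reading a folder concatenates its buckets in order.
function tasksInFolder(folderId) {
  return folderBuckets
    .filter((b) => b.folderId === folderId)
    .sort((a, b) => a.seq - b.seq)
    .flatMap((b) => b.tasks);
}

// Approach C: application-level stand-in for a foreign key constraint.
const folders = [{ _id: "f1" }, { _id: "f2" }];
const tasks = [{ _id: "t1", folderId: "f1" }];

function deleteFolder(folderId) {
  // Refuse to delete while tasks still reference the folder.
  if (tasks.some((t) => t.folderId === folderId)) return false;
  const index = folders.findIndex((f) => f._id === folderId);
  if (index === -1) return false;
  folders.splice(index, 1);
  return true;
}
```

Against a real database the check and the delete would be separate operations, so this sketch ignores concurrent writers.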

So, new conclusion: I think A and C might not show you the advantages of MongoDB (and might even be more work to build in MongoDB than in SQL), since they are what you would do in a traditional relational database, which is not what MongoDB was designed for (e.g. the missing join statement, no foreign key constraints). In sum, B best matches your access pattern "Usually (especially for syncing) tasks will be requested by-folder" while still making it easy to edit and move tasks once the folder is open.

Trudbert answered Nov 17 '22 21:11