
How do you track record relations in NoSQL?


All the answers for how to store many-to-many associations in the "NoSQL way" reduce to the same thing: storing data redundantly.

In NoSQL, you don't design your database based on the relationships between data entities. You design your database based on the queries you will run against it. Use the same criteria you would use to denormalize a relational database: if it's more important for data to have cohesion (think of values in a comma-separated list instead of a normalized table), then do it that way.

But this inevitably optimizes for one type of query (e.g. comments by any user for a given article) at the expense of other types of queries (comments for any article by a given user). If your application has the need for both types of queries to be equally optimized, you should not denormalize. And likewise, you should not use a NoSQL solution if you need to use the data in a relational way.
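As a rough illustration of that trade-off, here is a minimal Python sketch. The dict stands in for a denormalized store, and the names (`comments_by_article`, `add_comment`, and so on) are made up for this example, not taken from any particular database.

```python
from collections import defaultdict

# Hypothetical denormalized layout: comments are grouped under their article.
comments_by_article = defaultdict(list)

def add_comment(article_id, user_id, text):
    comments_by_article[article_id].append({"user": user_id, "text": text})

def comments_for_article(article_id):
    # The query this layout is optimized for: a single key lookup.
    return comments_by_article.get(article_id, [])

def comments_by_user(user_id):
    # The query it is not optimized for: every article's list must be scanned.
    return [
        (article_id, c["text"])
        for article_id, comments in comments_by_article.items()
        for c in comments
        if c["user"] == user_id
    ]

add_comment("article-1", "alice", "First!")
add_comment("article-1", "bob", "Nice post")
add_comment("article-2", "alice", "Agreed")
```

Making `comments_by_user` cheap as well would mean keeping a second copy of the data keyed by user, which is exactly the redundancy described above.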

There is a risk with denormalization and redundancy that redundant sets of data will get out of sync with one another. This is called an anomaly. When you use a normalized relational database, the RDBMS can prevent anomalies. In a denormalized database or in NoSQL, it becomes your responsibility to write application code to prevent anomalies.
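A small Python sketch of what such an anomaly looks like, assuming a hypothetical layout where each comment is stored twice (once per article, once per user). The function names are invented for illustration.

```python
# Two redundant copies of the same comment, one per access path (assumed layout).
by_article = {"article-1": [{"id": "c1", "user": "alice", "text": "Frist!"}]}
by_user = {"alice": [{"id": "c1", "article": "article-1", "text": "Frist!"}]}

def edit_comment_naive(article_id, comment_id, new_text):
    # Updates only one copy, so the other copy becomes stale: an anomaly.
    for c in by_article[article_id]:
        if c["id"] == comment_id:
            c["text"] = new_text

def edit_comment(article_id, user_id, comment_id, new_text):
    # Application code has to update every redundant copy itself.
    for c in by_article[article_id]:
        if c["id"] == comment_id:
            c["text"] = new_text
    for c in by_user[user_id]:
        if c["id"] == comment_id:
            c["text"] = new_text

edit_comment_naive("article-1", "c1", "First!")
stale = by_user["alice"][0]["text"] != by_article["article-1"][0]["text"]

edit_comment("article-1", "alice", "c1", "First!")
in_sync = by_user["alice"][0]["text"] == by_article["article-1"][0]["text"]
```

Even `edit_comment` is not atomic: a crash between the two loops would still leave the copies out of sync, which is the kind of case an RDBMS transaction handles for you.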

One might think that it'd be great for a NoSQL database to do the hard work of preventing anomalies for you. There is a paradigm that can do this -- the relational paradigm.


The CouchDB approach suggests emitting the relevant classes of items in the map phase and summarizing them in the reduce phase. So you could map all comments, emit 1 for the given user, and later print out only the ones. However, it would require a lot of disk storage to build persistent views of all trackable data in CouchDB. By the way, they also have a wiki page about relationships: http://wiki.apache.org/couchdb/EntityRelationship.
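To make the map/reduce idea concrete, here is a plain-Python imitation. It is not CouchDB code (real CouchDB views are normally JavaScript functions stored in a design document); `run_view`, `map_comment` and `reduce_counts` are hypothetical names, and the sample documents are invented.

```python
from collections import defaultdict

comments = [
    {"_id": "c1", "userid": "alice", "pageid": "p1"},
    {"_id": "c2", "userid": "bob", "pageid": "p1"},
    {"_id": "c3", "userid": "alice", "pageid": "p2"},
]

def map_comment(doc):
    # Map phase: emit one (key, value) pair per comment, keyed by user.
    yield doc["userid"], 1

def reduce_counts(values):
    # Reduce phase: summarize all values emitted for one key.
    return sum(values)

def run_view(docs, map_fn, reduce_fn):
    # Group emitted values by key, then reduce each group.
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}

comments_per_user = run_view(comments, map_comment, reduce_counts)
```

In this toy version the view is recomputed on every call; the disk cost mentioned above comes from CouchDB persisting such views instead.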

Riak, on the other hand, has a tool for building relations: links. You can put the address of a linked document (here, a comment) into the 'root' document (here, the user document). There is one catch: in a distributed setup, the document may be modified in many locations at the same time. That causes conflicts and, as a result, a huge vector clock tree. Not so bad, not so good.

Riak also has yet another 'mechanism': a two-layer key namespace, the so-called bucket and key. So, for the student example, if we have clubs A, B and C and students StudentX and StudentY, you could maintain the following convention:

{ Key = {ClubA, StudentX}, Value = true }, 
{ Key = {ClubB, StudentX}, Value = true }, 
{ Key = {ClubA, StudentY}, Value = true }

and to read the relation, just list the keys in the given buckets. What's wrong with that? It is damn slow. Listing buckets was never a priority for Riak, though it is getting better and better. By the way, you do not waste memory, because in this example {true} can be a link to the single full profile of StudentX or StudentY (here conflicts are not possible).
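The key convention can be sketched in Python with a dict standing in for the key-value store. This is not the Riak client API; `store`, `join_club`, `members_of` and `clubs_of` are hypothetical names used only to show why reading the relation amounts to listing keys.

```python
# Hypothetical in-memory stand-in for a key-value store.
store = {}

def join_club(club, student):
    # The relation lives entirely in the composite key; the value is a marker.
    store[(club, student)] = True

def members_of(club):
    # Reading the relation means listing keys, which is a full scan here.
    return sorted(s for (c, s) in store if c == club)

def clubs_of(student):
    return sorted(c for (c, s) in store if s == student)

join_club("ClubA", "StudentX")
join_club("ClubB", "StudentX")
join_club("ClubA", "StudentY")
```

Both lookups walk every key, which mirrors the complaint that key listing is the slow part of this scheme.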

As you can see, NoSQL != NoSQL. You need to look at the specific implementation and test it for yourself.

The column stores mentioned before look like a good fit for relations, but it all depends on your A, C and P needs ;) If you do not need A and you have less than petabytes of data, just leave it and go ahead with MySQL or Postgres.

good luck


  1. user:userid:comments is a reasonable approach - think of it as the equivalent of a column index in SQL, with the added requirement that you cannot query on unindexed columns.

  2. This is where you need to think about your requirements. A list with 30 million items is unreasonable not because it is slow, but because it is impractical to ever do anything with it. If your real requirement is to display some recent comments, you are better off keeping a very short list that gets updated whenever a comment is added. Remember that NoSQL has no normalization requirement. Race conditions are an issue with lists in a basic key-value store, but generally either your platform supports lists properly, you can do something with locks, or you don't actually care about failed updates.

  3. Same as for user comments - create an index keyword:posts

  4. More of the same - probably a list of clubs as a property of student and an index on that field to get all members of a club
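Points 1 and 2 could look roughly like this in Python. Plain dicts stand in for the key-value store, the `user:<id>:recent` key and `RECENT_LIMIT` are assumptions made for the example, and the race conditions mentioned in point 2 are ignored.

```python
from collections import defaultdict, deque

RECENT_LIMIT = 3  # assumed size of the short "recent comments" list

# Key names mirror the user:userid:comments convention.
user_comments = defaultdict(list)
user_recent = defaultdict(lambda: deque(maxlen=RECENT_LIMIT))

def add_comment(user_id, comment_id):
    # Full index: every comment id the user has ever posted.
    user_comments[f"user:{user_id}:comments"].append(comment_id)
    # Short list: appendleft on a bounded deque drops the oldest entry.
    user_recent[f"user:{user_id}:recent"].appendleft(comment_id)

for n in range(1, 6):
    add_comment("42", f"c{n}")

recent = list(user_recent["user:42:recent"])
```

The short list is redundant with the full index on purpose: the page that shows recent comments only ever reads the small key.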


You have

"user": {
    "userid": "unique value",
    "category": "student",
    "metainfo": "yada yada yada",
    "clubs": ["archery", "kendo"]
}

"comments": {
    "commentid": "unique value",
    "pageid": "unique value",
    "post-time": "ISO Date",
    "userid": "OP id -> THIS IS IMPORTANT"
}

"page": {
    "pageid": "unique value",
    "post-time": "ISO Date",
    "op-id": "user id",
    "tag": ["abc", "zxcv", "qwer"]
}

In a relational database, the normal thing to do with a one-to-many relation is to normalize the data. You would do the same thing in a NoSQL database as well. Simply index the fields you will use to fetch the information.

For example, the important indexes for you are:

  • Comment.UserID
  • Comment.PageID
  • Comment.PostTime
  • Page.Tag[]
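What these indexes buy you can be sketched with in-memory structures in Python. This is only an illustration of the idea, not how any specific database implements indexes; the sample documents and the helper `posted_after` are invented.

```python
import bisect
from collections import defaultdict

comments = [
    {"commentid": "c1", "userid": "u1", "pageid": "p1", "post-time": "2015-12-30T10:00:00"},
    {"commentid": "c2", "userid": "u2", "pageid": "p1", "post-time": "2016-02-01T09:30:00"},
    {"commentid": "c3", "userid": "u1", "pageid": "p2", "post-time": "2016-03-15T18:45:00"},
]
pages = [
    {"pageid": "p1", "tag": ["abc", "kendo"]},
    {"pageid": "p2", "tag": ["kendo"]},
]

# Single-value index: Comment.UserID -> comment ids.
comments_by_user = defaultdict(list)
for c in comments:
    comments_by_user[c["userid"]].append(c["commentid"])

# Multi-value index: every element of Page.Tag[] points back at the page.
pages_by_tag = defaultdict(list)
for p in pages:
    for tag in p["tag"]:
        pages_by_tag[tag].append(p["pageid"])

# Ordered index: ISO timestamps in one fixed format sort correctly as strings.
time_index = sorted((c["post-time"], c["commentid"]) for c in comments)
times = [t for t, _ in time_index]

def posted_after(iso_time):
    # Binary search for the first entry strictly later than iso_time.
    start = bisect.bisect_right(times, iso_time)
    return [commentid for _, commentid in time_index[start:]]
```

A lookup by user or tag is then a single dict access, and a range query on post time is a binary search instead of a scan over all comments.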

If you are using NosDB (a .NET-based NoSQL database with SQL support), your queries will look like this:

 SELECT * FROM Comments WHERE userid = 'That user';

 SELECT * FROM Comments WHERE pageid = 'That page';

 SELECT * FROM Comments WHERE post-time > DateTime('2016, 1, 1');

 SELECT * FROM Page WHERE tag = 'kendo';

Check all the supported query types from their SQL cheat sheet or documentation.