Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NoSQL / RDBMS hybrid with referential integrity (delete cascade)?

Is there a database out there that gives you the benefit of referential integrity and being able to use a SQL type language for querying, but also lets entities be loosely defined with respect to their data attributes and also the relationships between them?

E.g. take a RBAC type model where you have Permissions, Users, User Groups & Roles. A complex/flexible model could have the following rules:

  • Roles can have one or more permissions and a permission can belong to one or more Roles
  • Users can have one or more permissions and a permission can belong to one or more Users
  • Users Groups can have one or more permissions and a permission can belong to one or more Users Groups
  • Users can have one or more roles and a role can belong to one or more Users
  • User Groups can have one or more roles and a role can belong to one or more User Groups
  • Roles can have one or more roles and a role can belong to one or more Roles

To model the above in an RDBMS would involve the creation of lots of intersection tables. Ideally, all I'd like to define in the database is the entities themselves (User, Role, etc) plus some mandatory attributes. Everything else would then be dynamic (i.e. no DDL required), e.g. I could create a User with a new attribute which wasn't pre-defined. I could also create a relationship between entities that hasn't been predefined, though the database would handle referential integrity like a normal RDBMS.

The above can be achieved to some degree in a RDBMS by creating a table to store entities and another one to store relationships etc, but this overly complicates the SQL needed to perform simple queries and may also have performance implications.

like image 267
SeriousCallersOnly Avatar asked Aug 02 '10 13:08

SeriousCallersOnly


3 Answers

Most NoSQL databases are built to scale very well. This is done at the cost of consistency, of which referential integrity is part of. So most NoSQL don't support any type of relational constraints.

There's one type of NoSQL database that does support relations. In fact, it's designed especially for relations: the graph database. Graph databases store nodes and explicit relations (edges) between these nodes. Both nodes and edges can contain data in the form of key/value pairs, without being tied to a predefined schema.

Graph databases are optimized for relational queries and nifty graph operations, such as finding the shortest path between two nodes, or finding all nodes within a given distance from the current node. You wouldn't need this in a role/permission scenario, but if you do, it'll be a lot harder to achieve using an RDBMS.

Another option is to make your entire data layer a hybrid, by using a RDBMS to store the relations and a document database to store the actual data. This would complicate your application slightly, but I don't think it's such a bad solution. You'll be using two different technologies, both dealing with the problems they were designed to deal with.

like image 97
Niels van der Rest Avatar answered Oct 16 '22 10:10

Niels van der Rest


Given the requirements you specify in your question, a graph database is probably the sort of thing you are looking for, but there are other options. As @Niels van der Rest said, the two constraints of "no a priori schema" and "referential integrity" are very hard to reconcile. You might be able to find a Topic-Map based database that might do so, but I'm not familiar with specific implementations so I couldn't say for sure.

If you decide you really can't do without referential integrity, I fear you probably are stuck with an RDBMS. There are some tricks you can use that might avoid some of the problems you anticipate, I cover a couple in https://stackoverflow.com/questions/3395606..., which might give you some ideas. Still, for this sort of data-model requiring dynamic, post-priori schema, with meta-schema elements, an RDBMS is always going to be awkward.

If you are willing to forgo referential integrity, then you still have three approaches to consider.

  1. Map/Reduce - in two flavours: distributed record-oriented (think, MongoDB), and column-oriented (think, Cassandra). Scales really really well, but you won't have your SQL-like syntax; joins suck; and matching your architecture to your specific query types is critical. In your case your focus on the entities and their attributes, rather than the relationships between the entities themselves, so I would probably consider a distributed record-oriented store; but only if I expected to need to scale beyond a single node—they do scale really really well.

  2. Document-store - technically in two flavours, but one of them is a distributed record-oriented map/reduce datastore discussed above. The other is an inverted-index (think, Lucene/Solr). Do NOT disregard the power of an inverted-index; they can resolve obscenely complex record predicates amazingly fast. What they can't do is handle well is queries that include correlation or large relational joins. Still, you will be amazed at the incredible flexibility, sufficiently complex record predicates gives you.

  3. Graph-store - come in a few flavours the first is the large-scale, ad-hoc key-value store (think, DBM/TokyoTyrant); the second is the tuple-space (think, Neo4j); the third is the RDF database (think, Sesame/Mulgara). I have a soft-spot for RDF, having helped develop mulgara, so I am not the most objective commenter. Still, if your scalability constraints will permit you to use an RDF-store, I find the inferencing permitted by RDF's denotational semantics (rare amongst noSQL datastore options) invaluable.

like image 10
Recurse Avatar answered Oct 16 '22 10:10

Recurse


Some NoSQL solutions support security and SQL. One of these is OrientDB. The security system is (quite) well explained here.

Furthermore supports SQL.

like image 7
Lvca Avatar answered Oct 16 '22 09:10

Lvca