DDD/CQRS/ES Implement aggregate member using graph database aka using an immediately consistent readModel as entity collection

Abstract

I am modelling a generic authorization subdomain for my application. The requirements are quite complicated, as it needs to cope with multi-tenancy, a hierarchical organisation structure, resource groups, user groups, permissions, user-editable permissions and so on. It's a mixture of RBAC (users assigned to roles, roles having permissions, permissions allowing the execution of commands) with claims-based auth.

Problem

When checking for business rule invariants, I have to traverse the permission "graph" to find a permission for a user to execute a command on a resource in an environment. The traversal depth is arbitrary, on multiple dimensions.
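To make the traversal concrete, here is a minimal in-memory sketch using a breadth-first search. Everything in it is an assumption made for illustration: the node names (`user:alice`, `group:editors`), the `PermissionGraph` class and its methods are hypothetical, and the "environment" dimension is left out to keep it short.

```javascript
// Hypothetical in-memory permission graph. Membership edges point from a
// member to what it belongs to (user -> group -> role, resource -> resource group).
class PermissionGraph {
  constructor() {
    this.parents = new Map(); // node -> Set of nodes it belongs to
    this.grants = new Set();  // "subject|command|target" triples
  }

  addMembership(child, parent) {
    if (!this.parents.has(child)) this.parents.set(child, new Set());
    this.parents.get(child).add(parent);
  }

  grant(subject, command, target) {
    this.grants.add(`${subject}|${command}|${target}`);
  }

  // Breadth-first walk returning the start node plus all of its ancestors.
  // The visited set keeps the walk finite even if the graph has cycles.
  ancestors(start) {
    const visited = new Set([start]);
    const queue = [start];
    while (queue.length > 0) {
      const node = queue.shift();
      for (const parent of this.parents.get(node) || []) {
        if (!visited.has(parent)) {
          visited.add(parent);
          queue.push(parent);
        }
      }
    }
    return visited;
  }

  // A command is allowed if any ancestor of the user holds a grant
  // on any ancestor of the resource.
  canExecute(user, command, resource) {
    const subjects = this.ancestors(user);
    const targets = this.ancestors(resource);
    for (const s of subjects) {
      for (const t of targets) {
        if (this.grants.has(`${s}|${command}|${t}`)) return true;
      }
    }
    return false;
  }
}

const graph = new PermissionGraph();
graph.addMembership('user:alice', 'group:editors');
graph.addMembership('group:editors', 'role:content-manager');
graph.addMembership('doc:42', 'resources:marketing');
graph.grant('role:content-manager', 'PublishDocument', 'resources:marketing');

console.log(graph.canExecute('user:alice', 'PublishDocument', 'doc:42')); // true
console.log(graph.canExecute('user:bob', 'PublishDocument', 'doc:42'));   // false
```

The nested loop in `canExecute` is the part that grows with each extra dimension, which is where a graph query language would likely pay off.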

I could model this using code, but it would be best represented using a graph database as queries/updates on this aggregate would be faster. Also, it would reduce the complexity of the code itself. But this would require the graph database to be immediately consistent.

Still, I need to use CQRS/ES, and enable a distributed architecture.

So the graph database needs to be

  • Immediately consistent

And this introduces some drawbacks

  • When loading events from event-store, we have to reconstruct the graph database each time
  • Or, we have to introduce some kind of graph database snapshotting
  • Overhead when communicating with the graph database
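The first two drawbacks trade off against each other. As a rough sketch, assuming hypothetical event types (`MemberAdded`, `MemberRemoved`) and a plain `Set` of edges standing in for the graph database, restoring a snapshot and applying only the newer events should produce the same state as a full replay:

```javascript
// Hypothetical event shape: { version, type, child, parent }.
// The "graph" is a plain Set of edge keys; a real graph database would replace it.
function applyEvent(edges, event) {
  const key = `${event.child}->${event.parent}`;
  if (event.type === 'MemberAdded') edges.add(key);
  if (event.type === 'MemberRemoved') edges.delete(key);
  return edges;
}

// Full replay: the cost grows with the length of the event stream.
function rebuildFromEvents(events) {
  return events.reduce(applyEvent, new Set());
}

// Snapshot: a serialisable copy of the state at a known version.
function takeSnapshot(edges, version) {
  return { version, edges: [...edges] };
}

// Restore the snapshot, then apply only the events recorded after it.
function rebuildFromSnapshot(snapshot, events) {
  const edges = new Set(snapshot.edges);
  return events
    .filter((e) => e.version > snapshot.version)
    .reduce(applyEvent, edges);
}

const events = [
  { version: 1, type: 'MemberAdded', child: 'user:alice', parent: 'group:editors' },
  { version: 2, type: 'MemberAdded', child: 'user:bob', parent: 'group:editors' },
  { version: 3, type: 'MemberRemoved', child: 'user:bob', parent: 'group:editors' },
  { version: 4, type: 'MemberAdded', child: 'group:editors', parent: 'role:publisher' },
];

const snapshot = takeSnapshot(rebuildFromEvents(events.slice(0, 2)), 2);
const full = rebuildFromEvents(events);
const fast = rebuildFromSnapshot(snapshot, events);

console.log([...full].sort().join() === [...fast].sort().join()); // true
```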

But it has advantages

  • Reduced complexity of performing complex queries
  • Complex queries are resolved faster than with code
  • The graph database is perfect for this job

Why this question?

In other aggregates I modelled, I often have an EntityList instance or an EntityHierarchy instance. They are basically ordered/hierarchical collections of sub-entities. Their implementation is arbitrary: they can support anything from indexing to key-value pairs, dynamic arrays, etc., as long as they implement the interfaces I declared for them. I often even have methods like findById() or findByName() on those entities (lists). Those methods are similar to methods that could be executed on a database, but they're executed in-memory.

Thus, why not have an implementation of such a list that could be bound to a database? For example, instead of having a TMemoryEntityList, I would have a TMySQLEntityList. In the case at hand, perhaps having an implementation of a TGraphAuthorizationScheme that would live inside a TOrgAuthPolicy aggregate would be desirable, as long as it behaves like a collection, is iterable and supports the defined interfaces.
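A minimal sketch of that idea follows. The class names, the `put`/`get`/`all` store methods and the fake store are all hypothetical stand-ins, not a real database driver. Note that the sketch is synchronous; a list bound to a real database in Node.js would most likely have to expose asynchronous methods, which changes the interface the aggregate sees.

```javascript
// Hypothetical contract shared by every entity list implementation:
// add(entity), findById(id), findByName(name), and [Symbol.iterator]().
class MemoryEntityList {
  constructor() {
    this.byId = new Map();
  }
  add(entity) {
    this.byId.set(entity.id, entity);
  }
  findById(id) {
    return this.byId.get(id) || null;
  }
  findByName(name) {
    for (const entity of this) {
      if (entity.name === name) return entity;
    }
    return null;
  }
  [Symbol.iterator]() {
    return this.byId.values();
  }
}

// Same contract, delegating to an injected store object.
class StoreBackedEntityList {
  constructor(store) {
    this.store = store;
  }
  add(entity) {
    this.store.put(entity.id, entity);
  }
  findById(id) {
    return this.store.get(id) || null;
  }
  findByName(name) {
    return this.store.all().find((e) => e.name === name) || null;
  }
  [Symbol.iterator]() {
    return this.store.all()[Symbol.iterator]();
  }
}

// Minimal fake store so the sketch runs without any infrastructure.
function createFakeStore() {
  const rows = new Map();
  return {
    put: (id, value) => rows.set(id, value),
    get: (id) => rows.get(id),
    all: () => [...rows.values()],
  };
}

// Domain code depends only on the contract, not on the implementation.
function namesOf(list) {
  return [...list].map((e) => e.name);
}

const lists = [new MemoryEntityList(), new StoreBackedEntityList(createFakeStore())];
for (const list of lists) {
  list.add({ id: 1, name: 'Editors' });
  list.add({ id: 2, name: 'Admins' });
  console.log(list.findByName('Admins').id, namesOf(list));
}
```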

I'm building my application with JavaScript on Node.js. There is an in-memory graph database implementation for it called LevelGraph. Maybe I could use that as well. But let's continue.

Proposal

I know that in DDD terms the infrastructure should not leak into the domain. That's what I'm trying to prevent. It is also one of the reasons I am asking this question: this is the first time I have encountered such a technical need, and I would like advice from people who are used to dealing with this kind of problem.

The interface for the collection is IAuthorizationScheme. The implementation has to support deep traversal, authorization finding, etc. This is the interface I am thinking of backing with a graph database.

Sequence:

When a user asks to execute a command, I first authenticate them. I find their organisation, and ask the OrgAuthPolicyRepository to load that organisation's OrgAuthPolicy.

  1. The OrgAuthPolicyRepository loads the events from the EventStore.

  2. The OrgAuthPolicyRepository creates a new OrgAuthPolicy, with a dependency-injected TGraphAuthorizationScheme instance.

  3. The OrgAuthPolicyRepository applies all previous events to the OrgAuthPolicy, which in turn runs queries on the graph database to sync the graph database's state with the aggregate.

  4. The command handler executes the business rule validation checks. Some of them might include checks with the aggregate's IAuthorizationScheme.

  5. The business rules have been validated, and a domain event is dispatched.

  6. The aggregate handles this event, and applies it to itself. This might include changes to the IAuthorizationScheme.

  7. The eventBus dispatches the event to all listening eventHandlers on the read side.
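The sequence above can be sketched roughly as follows. This is a simplification under assumptions, not a reference implementation: the `PermissionGranted` event, the `grant`/`isAllowed` methods on the scheme, and the use of a `Map` as event store and an array of functions as event bus are all hypothetical, and the rule check lives in an aggregate method rather than a separate command handler.

```javascript
// In-memory stand-in for IAuthorizationScheme (method names are hypothetical).
class MemoryAuthorizationScheme {
  constructor() { this.grants = new Set(); }
  grant(userId, command) { this.grants.add(`${userId}|${command}`); }
  isAllowed(userId, command) { return this.grants.has(`${userId}|${command}`); }
}

class OrgAuthPolicy {
  constructor(scheme) {
    this.scheme = scheme;   // injected, so the aggregate never knows the storage
    this.uncommitted = [];
  }
  // State changes happen only here, both on replay and for new events (steps 3 and 6).
  apply(event) {
    if (event.type === 'PermissionGranted') {
      this.scheme.grant(event.userId, event.command);
    }
  }
  // Steps 4 to 6: check the business rule, then record and apply the event.
  grantPermission(actorId, userId, command) {
    if (!this.scheme.isAllowed(actorId, 'GrantPermission')) {
      throw new Error(`${actorId} may not grant permissions`);
    }
    const event = { type: 'PermissionGranted', userId, command };
    this.apply(event);
    this.uncommitted.push(event);
  }
}

class OrgAuthPolicyRepository {
  constructor(eventStore, eventBus, createScheme) {
    this.eventStore = eventStore;     // Map: orgId -> array of events
    this.eventBus = eventBus;         // array of handler functions
    this.createScheme = createScheme; // factory for the injected scheme
  }
  // Steps 1 to 3: load events, create the aggregate, replay.
  load(orgId) {
    const policy = new OrgAuthPolicy(this.createScheme());
    for (const event of this.eventStore.get(orgId) || []) policy.apply(event);
    return policy;
  }
  // Step 7: persist new events, then publish them to the read side.
  save(orgId, policy) {
    const stream = this.eventStore.get(orgId) || [];
    this.eventStore.set(orgId, stream.concat(policy.uncommitted));
    for (const event of policy.uncommitted) {
      for (const handler of this.eventBus) handler(event);
    }
    policy.uncommitted = [];
  }
}

const eventStore = new Map([
  ['org-1', [{ type: 'PermissionGranted', userId: 'root', command: 'GrantPermission' }]],
]);
const readSide = [];
const repo = new OrgAuthPolicyRepository(
  eventStore,
  [(event) => readSide.push(event)],
  () => new MemoryAuthorizationScheme()
);

const policy = repo.load('org-1');
policy.grantPermission('root', 'alice', 'PublishDocument');
repo.save('org-1', policy);

console.log(repo.load('org-1').scheme.isAllowed('alice', 'PublishDocument')); // true
console.log(readSide.length); // 1
```

Because the scheme is created fresh on every `load`, a graph-database-backed scheme would be rebuilt on each command unless it is cached or snapshotted.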


In summary

Is it conceivable/desirable to implement entities using external databases (e.g. a graph database) so that their implementation is easier? If yes, are there examples of such implementations, or guidelines? If not, what are the drawbacks of using such a technique?

Ludovic C asked Nov 10 '15 15:11


1 Answer

To solve your task I would consider the following options, in order from top to bottom:

  1. Reduce the task's complexity by employing a security framework or an identity management solution. An existing out-of-the-box identity management solution might do the job. If it doesn't, look at frameworks that help you implement your own. Unfortunately I'm not familiar enough with the Node.js world to recommend any. In the Java world that could be Apache Shiro or Spring Security. This could be a good option from both a cost and a security perspective.
  2. Maintain a single model instead of CQRS. This eliminates the consistency problems that arise if you decide to store your models in separate resources. As I understand it, permissions are changed rarely but read frequently. This means you can live with one model optimised for reads, which avoids both the consistency issues and the burden of maintaining two models. To track user behaviour you can implement auditing separately. In my experience, security auditing can require additional data that is most likely not in your data model.
  3. Do it with CQRS. Here I would first revisit the requirements to find a way to accept eventual consistency instead of strong consistency. This opens up many implementation options.

As to whether you should introduce a dedicated graph database, that is impossible to answer without knowing your domain, budget, desired system throughput and performance, existing infrastructure, team knowledge and setup, etc. You need to estimate the cost of the solution with a dedicated graph database and without one. My feeling is that unless permission management is the main idea of your project, or your project is mature enough (in number of users and R&D capacity), a dedicated database is unlikely to pay back its costs for your task.

To understand what a dedicated graph database could bring, compare it against your existing storage solutions. These two articles explain the potential benefits pretty well:

  • http://neo4j.com/developer/graph-db-vs-nosql/
  • http://neo4j.com/developer/graph-db-vs-rdbms/
Grygoriy Gonchar answered Nov 04 '22 09:11