I am going through the research stage for a new project and I am currently having a debate with a work colleague about the architecture of the project.
We have agreed that we will create a distributed messaging system using CQRS and Event sourcing with azure. It will be an SPA with the front end using angular js and the backend will be Web API.
We have now discussed how the database will be set up and this is where the differences has occurred.
We have agreed that we will split the database into two databases, one for the read and write. My work colleague wants to use sql server for both the read and write databases as he has spent his whole career in sql and doesn't want to hear anything else but sql. I on the other hand have been researching about NoSql and I feel it would be suitable for the read database as it is better for performance.
As CQRS is all about eventual consistency I have read that NoSql databases are based on this as well which is now making me think about using NoSql for the write database as well.
We also planned on creating an events table for each aggregate root instead of one generic events table that contains all the events. As these tables are not relational it made me think why we should use sql server.
My questions are more of a what is best practice or general approach to how people create their event stores.
Most of all, microservices are built for speed. NoSQL enables most services to get set up quickly, scale-out fast, and create additional data nodes where you can test without touching the persistence layer.
Deploying databases at scale in a way that supports microservices is often easier with NoSQL databases. NoSQL databases often have superior integration with real-time streaming technologies.
On the technical level, event sourcing can be implemented using dedicated storage systems, as well as general-purpose "NoSQL" and SQL databases.
Install MongoDB and run. Install SymmetricDS Pro and run. Create a new master node through the SymmetricDS Pro web console and connect it to your source relational database of choice. In this example I used Microsoft SQL Server.
Note: My answer below is from 2014. Times have changed, and several NoSQL databases support multi-document transactions now. I'm still advocating PostgreSQL for storing events in particular because of its awesome JSON support.
The last question is primarily opinion based, so let answer the first two questions from my perspective.
Before doing so, let me say that were are using an SQL server (Postgres) for our event store. With CQRS+ES it is trivial to have one primary master for the writes to the database, have the repository read from the slaves (not to be confused with the CQRS reads). Scalability should not be a problem in this setup (if Reddit can scale with Postgres in master/slave, so can you).
I have done an extensive evaluation of the various options for NoSQL stores. At the end of the day, for an CQRS+ES setup I could not find any advantages over the SQL approach in master/slave mode that, IMO, would outweigh the disadvantages you'll get with a NoSQL databases.
In particular, it is trivial to implement optimistic locking in an SQL store (unique constraint on streamId and version fields), but it is quite hard (impossible?) to achieve reliable optimistic locking in simple NoSQL key-value stores.
With some document stores it is possible (I think there is a reason why NEventStore supports only RavenDB and MongoDB.):
In Postgres, I'm using one single event table for all events.
As for the second question, I think you have to distinguish between the event store and the read model. We are using an SQL server for the event store (for the reasons mentioned above), but for the CQRS read model I think NoSQL "databases" are a great fit, as the eventual consistency approach fits very well with the NoSQL paradigm. Also, one usually queries the read side by key only anyways, so you can use any key-value store that you like and fits your needs. We are not even using a database at all, but an in-memory grid that is re-built on demand.
Personally, I prefer one aggregate per table, but this depends case by case. Even when using one table for all aggregates, partition tech could be used to alleviate potential performance issue.
I think NoSQL is feasible for both read and write side, but SQL database does provide an practical feature: Transactionality. This is especially useful when you need to commit several events in one transaction. For NoSQL database, you may have to change your schema to achieve this. For example, using per transaction per document when adopting Mongo. Or you could use Cassandra and make every row as an aggregate (one event per column, Cassandra provides partly transactionality for row level).
But this is not a big advantage if your program publishes only one event per transaction or Consistency is not a strong requirement.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With