Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the disadvantages of using Event sourcing and CQRS?

Event sourcing and CQRS is great because it gets rids developers being stuck with one pre-modelled database which the developer has to work with for the lifetime of the application unless there is a big data migration project. CQRS and ES also has other benefits like scaling eventstore, audit log etc. that are already all over the internet.

But what are the disadvantages ?

Here are some disadvantages that I can think of after researching and writing small demo apps

  1. Complex: Some people say ES is complex. But I'd say having a complex application is better than a complex database model on which you can only run very restricted queries using a query language (multiple joins, indexes etc). I mean some programming languages like Scala have very rich collection library that is very flexible to produce some seriously complex aggregations and also there is Apache Spark which makes it easy query distributed collections. But databases will always be restricted to it's query language capabilities and distributing databases are harder then distributed application code (Just deploy another instance on another machine!).
  2. High disk space usage: Event store might end up using a lot of disk space to store events. But we can schedule a clean up every few weeks and creating snapshot and may be we can store historical events locally on an external HD just incase we need old events in the future ?
  3. High memory usage: State of every domain object is stored in memory which might increase RAM usage and we all how expensive RAM is. BIG PROBLEM!! because I'm poor! any solution to this ? May be use Sqlite instead of storing state in memory ? Am I making things more complex by introducing multiple Sqlite instances in my application ?
  4. Longer bootup time: On failure or software upgrade bootup is slow depending on the number of events. But we can use snapshots to solve this ?
  5. Eventual consistency: Problem for some applications. Imagine if Facebook used Event sourcing with CQRS for storing posts and considering how busy facebook's system is and if I posted a post I would see my fb post the next day :)
  6. Serialized events in Event store: Event stores store events as serialized objects which means we can't query the content of events in the event store which is discouraged anyway. And we won't be able to add another attribute to the event in the future. Solution would be to store events as JSON objects instead of Serialized events ? But is that a good idea ? Or add more Events to support the change to orignal event object ?

Can someone please comment on the disadvantages I brought up here and correct me if I am wrong and suggest any other I may have missed out ?

like image 953
user794783 Avatar asked Oct 22 '15 11:10

user794783


People also ask

What is the difference between Event Sourcing and CQRS?

CQRS is implemented by a separation of responsibilities between commands and queries, and event sourcing is implemented by using the sequence of events to track changes in data.

What benefits do we get when we use Event Sourcing with CQRS?

Event sourcing is a powerful pattern and brings a lot of benefits to an application architecture if used appropriately: Makes write operations much faster as there is no read, update, and write required; write is merely appending an event to a log.

What is the difference between event driven and Event Sourcing?

Event Sourcing is keeping a log for your own use so you don't forget. Event Driven Architecture is about communicating what happened to others. Typically, components in EDA can't recover the entirety of their state from the events they've published because not everything that changed their state is worth publishing.


3 Answers

Here is my take on this.

  1. CQRS + ES can make things a lot simpler in complex software systems by having rich domain objects, simple data models, history tracking, more visibility into concurrency problems, scalability and much more. It does require a different way thinking about the systems so it could be difficult to find qualified developers. But CQRS makes it simpler to separate responsibilities across developers. For example, a junior developer can work purely with the read side without having to touch business logic.

  2. Copies of data will require more disk space for sure. But storage is relatively cheap these days. It may require the IT support team to do more backups and planning how to restore the system in a case in things go wrong. However, server virtualization these days makes it a more streamlined workflow. Also, it is much easier to create redundancy in the system without a monolithic database.

  3. I do not consider higher memory usage a problem. Business object hydration should be done on demand. Objects should not keep references to events that have already been persisted. And event hydration should happen only when persisting data. On the read side you do not have Entity -> DTO -> ViewModel conversions that usually happened in tiered systems, and you would not have any kind of object change tracking that full featured ORMs usually do. Most systems perform significantly more reads than writes.

  4. Longer boot up time can be a slight problem if you are using multiple heterogeneous databases due to initialization of various data contexts. However, if you are using something simple like ADO .NET to interact with the event store and a micro-ORM for the read side, the system will "cold start" faster than any full featured ORM. The important thing here is not to over-complicate how you access the data. That is actually a problem CQRS is supposed to solve. And as I said before, the read side should be modeled for the views and not have any overhead of re-mapping data.

  5. Two-phase commit can work well for systems that do not need to scale for thousands of users in my experience. You would need to choose databases that would work well with the distributed transaction coordinator. PostgreSQL can work well for read and write separate models, for example. If the system needs to scale for a high number of concurrent users, it would have to be designed with eventual consistency in mind. There are cases where you would have aggregate roots or context boundaries that do not use CQRS to avoid eventual consistency. It makes sense for non-collaborative parts of the domain.

  6. You can query events in serialized a format like JSON or XML, if you choose the right database for the event store. And that should be only done for purposes of analytics. Nothing inside the system should query event store by anything other than the aggregate root id and the event type. That data would be indexed and live outside the serialized event.

like image 188
Dmitry S. Avatar answered Oct 23 '22 15:10

Dmitry S.


Just to comment on point 5. I've been told that Facebook does use ES with Eventual Consistency, which is why you can sometimes see a post disappear and reappear after you've posted it.

Usually the read-model your browser is accessing is located 'close' to you, but after you make a post the SPA switches over to a read-model that is close to your write-model. The close proximity between the write-model (events) and the read-model mean you get to see your own post.

However, 15 minutes later your SPA switches back to the first, closer, read-model. If the event containing your post hasn't yet propagated to that read-model you'll see your own post disappear only to reappear sometime later.

like image 44
Sam Stickland Avatar answered Oct 23 '22 17:10

Sam Stickland


I know it's been almost 3 years since this question was asked, but still this article may be useful for someone. Key points are

  • Scaling with snapshots
  • Visibility of data
  • Schema changing
  • Dealing with complex domains
  • Need to explain it to most new team members
like image 9
kagetoki Avatar answered Oct 23 '22 17:10

kagetoki