We have microservices that work with different but related data, for example ads and their statistics. We want to be able to filter, sort, and aggregate this related data for the UI (and not only for it). For example, we want to show a user the ads that contain 'car' in their text and have more than 100 clicks.
Challenges:
Requirements:
Solutions we could think of:
What should we pay attention to? Are there other ways to solve our problem?
Creating a single database shared by different microservices is an anti-pattern; the correct approach is to create a database per microservice.
7.1. One way to implement query operations, such as findOrder(), that retrieve data owned by multiple services is to use the API composition pattern. This pattern implements a query operation by simply invoking the services that own the data and combining the results. Figure 7.2 shows the structure of this pattern.
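The API composition pattern can be sketched as follows. This is a minimal illustration, not a definitive implementation: the two service functions are hypothetical in-process stand-ins for what would normally be remote calls (e.g. REST requests) to the ad service and the stats service, and all names and data are made up.

```python
# Hypothetical stand-in for the ad service's query API.
def ad_service_find_ads(text_filter):
    ads = [
        {"ad_id": 1, "text": "car for sale"},
        {"ad_id": 2, "text": "house for rent"},
        {"ad_id": 3, "text": "used car parts"},
    ]
    return [ad for ad in ads if text_filter in ad["text"]]

# Hypothetical stand-in for the stats service's query API.
def stats_service_get_clicks(ad_ids):
    clicks = {1: 150, 2: 300, 3: 42}
    return {ad_id: clicks.get(ad_id, 0) for ad_id in ad_ids}

def find_ads_with_clicks(text_filter, min_clicks):
    """API composer: invoke both owning services, then join and
    filter the results in memory."""
    ads = ad_service_find_ads(text_filter)
    clicks = stats_service_get_clicks([ad["ad_id"] for ad in ads])
    return [
        {**ad, "clicks": clicks[ad["ad_id"]]}
        for ad in ads
        if clicks[ad["ad_id"]] > min_clicks
    ]
```

For example, `find_ads_with_clicks("car", 100)` returns only the ad whose text contains 'car' and which has more than 100 clicks. Note that the composer does the filtering after fetching, which is exactly why this pattern gets expensive for large result sets.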
Yes, it is possible to share a database between microservices: you can create a single shared database, with each service accessing the data using local ACID transactions.
Microservices guidelines strongly recommend following the Single Repository Principle (SRP), which means each microservice maintains its own database and no service should access another service's database directly. There is no direct and simple way of maintaining ACID principles across multiple databases.
You could use CQRS. In this architecture, the model used for writing data is split from the model used to read/query data. The write model is the canonical source of information, the source of truth.
The write model publishes events that are interpreted/projected by one or more read models, in an eventually consistent manner. Those events could even be published to a message queue and consumed by external read models (other microservices). There is no 1:1 mapping from write to read: you can have 1 model for writes and 3 models for reads. Each read model is optimized for its use case. This is the part that interests you: a speed-optimized read model.
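A read-model projection can be sketched like this. It is a simplified, in-memory illustration under assumed event shapes (`AdCreated`, `AdClicked` are made-up names): the write side emits events, and the projection consumes them to maintain a denormalized view that answers the combined ads-plus-stats query without joins.

```python
# Denormalized read model: ad_id -> {"text": ..., "clicks": ...}.
# In production this would live in its own database, updated by an
# event consumer; here it is a plain dict for illustration.
read_model = {}

def project(event):
    """Apply one event from the write side to the read model."""
    if event["type"] == "AdCreated":
        read_model[event["ad_id"]] = {"text": event["text"], "clicks": 0}
    elif event["type"] == "AdClicked":
        read_model[event["ad_id"]]["clicks"] += 1

def query_ads(text_filter, min_clicks):
    # Everything the query needs is already in one place: no joins.
    return [
        {"ad_id": ad_id, **ad}
        for ad_id, ad in read_model.items()
        if text_filter in ad["text"] and ad["clicks"] > min_clicks
    ]

# Events published by the write model (illustrative stream).
events = [
    {"type": "AdCreated", "ad_id": 1, "text": "car for sale"},
    {"type": "AdClicked", "ad_id": 1},
    {"type": "AdClicked", "ad_id": 1},
]
for e in events:
    project(e)
```

The key design choice is that the projection, not the query, pays the cost of combining data from multiple services, so reads stay fast at the price of eventual consistency.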
An optimized read model has everything it needs to answer its queries. The data is fully denormalized (meaning it needs no joins) and already indexed.
A read model can have its data sharded. You do this to minimize the collection size (a small collection is faster than a bigger one). In your case, you could shard by user: each user would have their own collection of statistics (i.e. a table in SQL or a document collection in NoSQL). You can use the built-in sharding of the database, or you can shard manually by splitting the data into separate collections (tables).
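Manual sharding by user can be sketched as follows. The collection-naming scheme is an assumption made up for illustration; the point is simply that each user's statistics are routed to their own small collection, so a query only ever scans one user's data.

```python
def collection_for_user(user_id):
    # Illustrative naming scheme: one collection (table) per user.
    return f"stats_user_{user_id}"

# Stand-in for the database: collection name -> list of rows.
shards = {}

def insert_stat(user_id, row):
    shards.setdefault(collection_for_user(user_id), []).append(row)

def stats_for_user(user_id):
    # Only this user's (small) collection is touched by the query.
    return shards.get(collection_for_user(user_id), [])
```

With a document database you would typically use the built-in sharding with the user id as the shard key instead of managing collections by hand.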
Individual services don't have all the data.
A read model could subscribe to many sources of truth (i.e. microservices or event streams).
One particular case that works very well with CQRS is event sourcing; it has the advantage that you have the events from the beginning of time, without the need to store them in a persistent message queue.
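The idea can be sketched in a few lines, with made-up event shapes: the append-only event log is the source of truth, and any read model can be built (or rebuilt from scratch) by replaying the log from the beginning.

```python
# Append-only event log: the source of truth in event sourcing.
event_log = []

def append(event):
    event_log.append(event)

def replay(events):
    """Rebuild a click-count read model from the full event history."""
    clicks = {}
    for e in events:
        if e["type"] == "AdClicked":
            clicks[e["ad_id"]] = clicks.get(e["ad_id"], 0) + 1
    return clicks

append({"type": "AdClicked", "ad_id": 1})
append({"type": "AdClicked", "ad_id": 1})
append({"type": "AdClicked", "ad_id": 2})
```

Because the log is never discarded, a new read model added months later can still be populated with the complete history, which is what makes event sourcing a natural fit for CQRS.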
P.S. I could not think of a use case where a read model could not be made fast enough, given enough hardware resources.