We have microservices that work with different but related data, for example ads and their statistics. We want to be able to filter, sort, and aggregate this related data for the UI (and not only for it). For example, we want to show a user the ads that contain 'car' in their text and have more than 100 clicks.
Challenges:
Requirements:
Solutions we could think of:
What should we pay attention to? Are there other ways to solve our problem?
Creating a single database shared by different microservices is an anti-pattern; the correct approach is to create a database per microservice.
7.1. One way to implement query operations, such as findOrder(), that retrieve data owned by multiple services is to use the API composition pattern. This pattern implements a query operation by simply invoking the services that own the data and combining the results. Figure 7.2 shows the structure of this pattern.
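The API composition pattern can be sketched as follows. This is a minimal illustration, not a definitive implementation: the two service functions are hypothetical in-process stand-ins for what would normally be remote calls (e.g. REST requests) to the ad service and the stats service, and all names and data are made up.

```python
# Hypothetical stand-in for the ad service's query API.
def ad_service_find_ads(text_filter):
    ads = [
        {"ad_id": 1, "text": "car for sale"},
        {"ad_id": 2, "text": "house for rent"},
        {"ad_id": 3, "text": "used car parts"},
    ]
    return [ad for ad in ads if text_filter in ad["text"]]

# Hypothetical stand-in for the stats service's query API.
def stats_service_get_clicks(ad_ids):
    clicks = {1: 150, 2: 300, 3: 42}
    return {ad_id: clicks.get(ad_id, 0) for ad_id in ad_ids}

def find_ads_with_clicks(text_filter, min_clicks):
    """API composer: invoke both owning services, then join and
    filter the results in memory."""
    ads = ad_service_find_ads(text_filter)
    clicks = stats_service_get_clicks([ad["ad_id"] for ad in ads])
    return [
        {**ad, "clicks": clicks[ad["ad_id"]]}
        for ad in ads
        if clicks[ad["ad_id"]] > min_clicks
    ]
```

For example, `find_ads_with_clicks("car", 100)` returns only the ad whose text contains 'car' and which has more than 100 clicks. Note that the composer does the filtering after fetching, which is exactly why this pattern gets expensive for large result sets.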
Yes, it is possible to share a database between microservices: you can create a single shared database, with each service accessing the data using local ACID transactions.
Microservices guidelines strongly recommend following the Single Repository Principle (SRP), which means each microservice maintains its own database and no service should access another service's database directly. There is no direct and simple way of maintaining ACID principles across multiple databases.
You could use CQRS. In this architecture, the model used for writing data is split from the model used to read/query data. The write model is the canonical source of information, the source of truth.
The write model publishes events that are interpreted/projected by one or more read models, in an eventually consistent manner. Those events could even be published to a message queue and consumed by external read models (other microservices). There is no 1:1 mapping from write to read: you can have 1 model for writes and 3 models for reads. Each read model is optimized for its use case. This is the part that interests you: a speed-optimized read model.
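A read-model projection can be sketched like this. It is a simplified, in-memory illustration under assumed event shapes (`AdCreated`, `AdClicked` are made-up names): the write side emits events, and the projection consumes them to maintain a denormalized view that answers the combined ads-plus-stats query without joins.

```python
# Denormalized read model: ad_id -> {"text": ..., "clicks": ...}.
# In production this would live in its own database, updated by an
# event consumer; here it is a plain dict for illustration.
read_model = {}

def project(event):
    """Apply one event from the write side to the read model."""
    if event["type"] == "AdCreated":
        read_model[event["ad_id"]] = {"text": event["text"], "clicks": 0}
    elif event["type"] == "AdClicked":
        read_model[event["ad_id"]]["clicks"] += 1

def query_ads(text_filter, min_clicks):
    # Everything the query needs is already in one place: no joins.
    return [
        {"ad_id": ad_id, **ad}
        for ad_id, ad in read_model.items()
        if text_filter in ad["text"] and ad["clicks"] > min_clicks
    ]

# Events published by the write model (illustrative stream).
events = [
    {"type": "AdCreated", "ad_id": 1, "text": "car for sale"},
    {"type": "AdClicked", "ad_id": 1},
    {"type": "AdClicked", "ad_id": 1},
]
for e in events:
    project(e)
```

The key design choice is that the projection, not the query, pays the cost of combining data from multiple services, so reads stay fast at the price of eventual consistency.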
An optimized read model has everything it needs to answer its queries. The data is fully denormalized (meaning it needs no joins) and already indexed.
A read model can have its data sharded. You do this to minimize the collection size (a small collection is faster than a bigger one). In your case, you could shard by user: each user would have their own collection of statistics (i.e. a table in SQL or a document collection in NoSQL). You can use the built-in sharding of the database, or you can shard manually by splitting the data into separate collections (tables).
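Manual sharding by user can be sketched as follows. The collection-naming scheme is an assumption made up for illustration; the point is simply that each user's statistics are routed to their own small collection, so a query only ever scans one user's data.

```python
def collection_for_user(user_id):
    # Illustrative naming scheme: one collection (table) per user.
    return f"stats_user_{user_id}"

# Stand-in for the database: collection name -> list of rows.
shards = {}

def insert_stat(user_id, row):
    shards.setdefault(collection_for_user(user_id), []).append(row)

def stats_for_user(user_id):
    # Only this user's (small) collection is touched by the query.
    return shards.get(collection_for_user(user_id), [])
```

With a document database you would typically use the built-in sharding with the user id as the shard key instead of managing collections by hand.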
Individual services don't have all the data.
A read model could subscribe to many sources of truth (i.e. microservices or event streams).
One particular case that works very well with CQRS is event sourcing; it has the advantage that you have the events from the beginning of time, without the need to store them in a persistent message queue.
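The idea can be sketched in a few lines, with made-up event shapes: the append-only event log is the source of truth, and any read model can be built (or rebuilt from scratch) by replaying the log from the beginning.

```python
# Append-only event log: the source of truth in event sourcing.
event_log = []

def append(event):
    event_log.append(event)

def replay(events):
    """Rebuild a click-count read model from the full event history."""
    clicks = {}
    for e in events:
        if e["type"] == "AdClicked":
            clicks[e["ad_id"]] = clicks.get(e["ad_id"], 0) + 1
    return clicks

append({"type": "AdClicked", "ad_id": 1})
append({"type": "AdClicked", "ad_id": 1})
append({"type": "AdClicked", "ad_id": 2})
```

Because the log is never discarded, a new read model added months later can still be populated with the complete history, which is what makes event sourcing a natural fit for CQRS.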
P.S. I could not think of a use case where a read model could not be made fast enough, given enough hardware resources.