 

Microservice architecture - carry message through services when order doesn't matter

Tl;dr: "How can I push a message through a bunch of asynchronous, unordered microservices and know when that message has made it through each of them?"

I'm struggling to find the right messaging system/protocol for a specific microservices architecture. This isn't a "which is best" question, but a question about what my options are for a design pattern/protocol.

Diagram

  • I have a message on the beginning queue. Let's say a RabbitMQ message with serialized JSON
  • I need that message to go through an arbitrary number of microservices
  • Each of those microservices is long-running, must be independent, and may be implemented in a variety of languages
  • The order of services the message goes through does not matter. In fact, it should not be synchronous.
  • Each service can append data to the original message, but that data is ignored by the other services. There should be no merge conflicts (each service writes a unique key), as in the example after this list. No service will change or destroy data.
  • Once all the services have had their turn, the message should be published to a second RabbitMQ queue with the original data and the new data.
  • The microservices will have no other side-effects. If this were all in one monolithic application (and in the same language), functional programming would be perfect.
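
For concreteness, a message before and after enrichment might look like the following (all field and service names here are invented for illustration):

```python
# Hypothetical message envelope, before any service has touched it.
original = {
    "id": "msg-123",
    "payload": {"text": "some input to analyse"},
}

# After every service has had its turn: each one has appended exactly one
# top-level key of its own, and the original data is untouched.
enriched = {
    "id": "msg-123",
    "payload": {"text": "some input to analyse"},
    "language_detector": {"lang": "en"},
    "sentiment_scorer": {"score": 0.87},
    "entity_tagger": {"entities": []},
}
```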

So, the question is: what is an appropriate way to manage that message through the various services? I don't want to have to run them one at a time, and the order isn't important. But if that's the case, how can the system know when all the services have had their whack, so the final message can be written onto the ending queue (and the next batch of services can have their go)?

The only semi-elegant solution I could come up with (roughly sketched below) was:

  1. Have the first service that encounters a message write that message to common storage (say, MongoDB).
  2. Have each service do its thing, mark that it has completed for that message, and then check whether all the services have had their turn.
  3. If so, that last service publishes the message.
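
A rough sketch of that workaround, assuming MongoDB via pymongo for the common storage (the collection and service names are made up):

```python
# Every service calls mark_done() after processing a message. Note that
# ALL_SERVICES has to be hard-coded (or configured) in every service, which is
# exactly the coupling complained about below.
from pymongo import MongoClient, ReturnDocument

ALL_SERVICES = {"service_a", "service_b", "service_c"}  # every service must know this list
progress = MongoClient().pipeline.progress              # hypothetical db/collection names

def mark_done(message_id: str, service_name: str) -> bool:
    """Record that service_name finished message_id; return True if it was the last one."""
    doc = progress.find_one_and_update(
        {"_id": message_id},
        {"$addToSet": {"done_by": service_name}},
        upsert=True,
        return_document=ReturnDocument.AFTER,
    )
    # The service that gets True back is the one that publishes the merged
    # message to the ending queue.
    return set(doc["done_by"]) >= ALL_SERVICES
```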

But that still requires each service to be aware of all the other services and requires each service to leave its mark. Neither of those is desired.

I am open to a "Shepherd" service of some kind.

I would appreciate any options that I have missed, and am willing to concede that there may be a better fundamental design.

Thank you.

asked Dec 21 '17 by Apollo



2 Answers

There are two methods of managing a long-running process (or a process involving multiple microservices): orchestration and choreography. There are a lot of articles describing them.

Long story short: in orchestration you have a microservice that keeps track of the process status, and in choreography all the microservices know where to send the message next and/or when the process is done. A rough sketch of each style is given after its tradeoffs below.

This article explains the benefits and tradeoffs of the two styles.

Orchestration

Orchestration Benefits

  • Provides a good way for controlling the flow of the application when there is synchronous processing. For example, if Service A needs to complete successfully before Service B is invoked.

Orchestration Tradeoffs

  • Couples the services together, creating dependencies. If service A is down, services B and C will never be called.

  • If there is a central shared instance of the orchestrator for all requests, then the orchestrator is a single point of failure. If it goes down, all processing stops.

  • Leverages synchronous processing that blocks requests. In this example, the total end-to-end processing time is the sum of time it takes for Service A + Service B + Service C to be called.
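
To make the synchronous, coupled flavour concrete, here is a minimal orchestrator sketch (the HTTP endpoints are hypothetical; a real orchestrator would also need retries and error handling):

```python
# Minimal orchestration sketch: a central coordinator calls each service in turn
# and merges the enrichment each one returns. Service URLs are invented.
import requests

SERVICE_URLS = [
    "http://service-a/enrich",
    "http://service-b/enrich",
    "http://service-c/enrich",
]

def orchestrate(message: dict) -> dict:
    for url in SERVICE_URLS:
        # Synchronous, blocking call: total latency is the sum over all services,
        # and a single unreachable service halts the whole flow.
        response = requests.post(url, json=message, timeout=30)
        response.raise_for_status()
        message.update(response.json())  # each service returns its own unique key
    return message
```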

Choreography

Choreography Benefits

  • Enables faster end-to-end processing as services can be executed in parallel/asynchronously.

  • Easier to add/update services as they can be plugged in/out of the event stream easily.

  • Aligns well with an agile delivery model as teams can focus on particular services instead of the entire application.

  • Control is distributed, so there is no longer a single orchestrator serving as a central point of failure.

  • Several patterns can be used with a reactive architecture to provide additional benefits. For example, Event Sourcing is when the Event Stream stores all of the events and enables event replay. This way, if a service went down while events were still being produced, when it came back online it could replay those events to catch back up. Also, Command Query Responsibility Segregation (CQRS) can be applied to separate out the read and write activities. This enables each of these to be scaled independently. This comes in handy if you have an application that is read-heavy and light on writes or vice versa.

Choreography Tradeoffs

  • Async programming is often a significant mindshift for developers. I tend to think of it as similar to recursion, where you can’t figure out how code will execute by just looking at it, you have to think through all of the possibilities that could be true at a particular point in time.

  • Complexity is shifted. Instead of having the flow control centralized in the orchestrator, the flow control is now broken up and distributed across the individual services. Each service would have its own flow logic, and this logic would identify when and how it should react based on specific data in the event stream.
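
A minimal choreography sketch for a single service, assuming RabbitMQ via the pika client (all exchange, queue, and service names are invented): each service binds its own queue to a shared fanout exchange, does its work in parallel with the others, and announces completion on a separate exchange that some downstream component can count.

```python
import json
import pika

SERVICE_NAME = "service_a"  # hypothetical

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="incoming", exchange_type="fanout")
channel.exchange_declare(exchange="completions", exchange_type="topic")

# Each service gets its own queue bound to the fanout exchange, so every
# service sees every message, in no particular order and in parallel.
declared = channel.queue_declare(queue=f"{SERVICE_NAME}.incoming")
channel.queue_bind(exchange="incoming", queue=declared.method.queue)

def handle(ch, method, properties, body):
    message = json.loads(body)
    message[SERVICE_NAME] = {"score": 0.87}  # placeholder for this service's real work
    # Announce that this service is done with this message; whoever aggregates
    # these completion events decides when the merged result is ready.
    ch.basic_publish(
        exchange="completions",
        routing_key=f"{SERVICE_NAME}.done",
        body=json.dumps(message),
    )
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue=declared.method.queue, on_message_callback=handle)
channel.start_consuming()
```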

answered Oct 16 '22 by Constantin Galbenu


I would go along with the common storage idea.

Have each microservice register itself with the common storage, and have each microservice record that it has processed a given message identifier when it does.

From that, you can work out which n services should process a message and how many of those n services have processed it so far.

No services need to be aware of each other.
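
A minimal sketch of this idea, assuming MongoDB (pymongo) for the common storage and RabbitMQ (pika) for the queues; every name below is illustrative:

```python
# Each service registers itself once, records its completions, and whichever
# service happens to finish a given message last publishes the merged result.
import json
import pika
from pymongo import MongoClient, ReturnDocument

SERVICE_NAME = "sentiment_scorer"   # hypothetical service name

mongo = MongoClient()
registry = mongo.pipeline.registry  # which services are currently registered
progress = mongo.pipeline.progress  # per-message completion tracking

# 1. Register on startup: no service needs to know about any other service.
registry.update_one({"_id": "services"},
                    {"$addToSet": {"names": SERVICE_NAME}},
                    upsert=True)

def on_message(channel, method, properties, body):
    message = json.loads(body)
    message[SERVICE_NAME] = {"score": 0.87}  # placeholder for this service's real work

    # 2. Record this service's completion and its contribution for this message.
    doc = progress.find_one_and_update(
        {"_id": message["id"]},
        {"$addToSet": {"done_by": SERVICE_NAME},
         "$set": {f"data.{SERVICE_NAME}": message[SERVICE_NAME]}},
        upsert=True,
        return_document=ReturnDocument.AFTER,
    )

    # 3. If every registered service has now had its turn, publish the merged
    #    message to the ending queue. (Extra care is needed if services register
    #    or deregister while messages are in flight.)
    expected = registry.find_one({"_id": "services"})["names"]
    if set(doc["done_by"]) >= set(expected):
        merged = {**message, **doc.get("data", {})}
        channel.basic_publish(exchange="", routing_key="ending_queue",
                              body=json.dumps(merged))
    channel.basic_ack(delivery_tag=method.delivery_tag)
```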

answered Oct 16 '22 by Jonno