Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Processing a message exactly once

Tags:

jms

messaging

esb

Please consider the scenario as shown in the attached image :

Multiple instances of multiple apps.

  1. The Portal(producer) will send some message to the bus to which has to be processed by multiple applications(consumer) – PAYROLLAPP, HELPDESK etc.
  2. Multiple instances of consumer applications may be running, also these instances can be added/ removed dynamically
  3. Now, it is critical to ensure that message is processed only once, per application i.e if PAYROLLAPP -1 processes the message, PAYROLLAPP -2 should NOT process it; of course, in the above diagram, HELPDESK – 1 must process it. In short, in case of multiple instances, exactly one must process the message, once
  4. When I searched for answers, most of the stuff was about creating a 'selective consumer' - a consumer that accepts/rejects a message based on some logic - please note that no changes/additions/wrapping can be done for the applications shown in the diagram; the logic has to reside somewhere in the provider that manages the bus

Please guide about the same.

Adding more details after Petter's answer :

My current understanding and approaches

  1. The items to the to the left of the left-dotted line are the 'approaches' - Pure JMS,ESB,EAI
  2. The items to the to the right of the right-dotted line are the 'implementations'

Now, the big part - QUERIES :

  1. Irrespective of the solution(pure JMS, ESB, EAI), does the part below the horizontal dotted line(application-specific queues) needs to be implemented?
  2. How does the usage of ESB(JBoss ESB etc.), instead of ‘pure’ JMS(Active MQ etc.), help/ hamper? Does ESB provide any advantage over JMS which is ‘java-only’(?). I am hell confused – ‘ESB or JMS’, even after referring threads like these : JMS and ESB - how they are related?. It has one reply which says “JMS is not well suited for the integration of REST services, File systems, S/FTP, Email, Hessian, SOAP etc. which are better handled with an ESB that supports these types natively. For example, if you have a process that dumps a CSV file of 500MB at midnight, and you want another system to pickup the file, parse CSV and import into a database, this can easily be accomplished by an ESB - whereas a solution with just JMS will be bad. Similarly, integration of REST services, with load balancing/ failover to multiple backend instances can be done better with an ESB supporting HTTP/S natively.” It only added to my confusion !!!
  3. Is the usage of EAI framework (Apache Camel etc.) an approach entirely different from the pure JMS or ESB approach? If yes, how and what are the pros/cons?
  4. I was told that ESB alone won’t help, BPM(or something else?) needs to be used to define and store the ‘routing’ logic – is this true?
like image 550
Kaliyug Antagonist Avatar asked Apr 27 '12 16:04

Kaliyug Antagonist


People also ask

What is exactly once processing?

Within Integration Server, exactly-once processing is a facility that ensures one-time processing of a guaranteed document received by a trigger. Integration Server provides exactly-once processing by performing duplicate detection and by supplying the ability to retry trigger services.

How do you get exactly once delivered in a text?

This can be achieved in two ways: Deduplication: dropping messages if they are received more than once. Idempotent processing: applying messages more than once has precisely the same effect as applying it exactly once.

How do you handle exactly once in Kafka?

This exactly once process works by committing the offsets by producers instead of consumer. i.e., the produce of result to kafka and committing the consumed messages all are done by kafka producer (instead of separate kafka consumer and producer) which brings the exactly once.

Does Kafka support exactly once delivery?

Initially, Kafka only supported at-most-once and at-least-once message delivery. However, the introduction of Transactions between Kafka brokers and client applications ensures exactly-once delivery in Kafka.


2 Answers

I see the point. This might be a bit tricky with "pure" JMS.

What you essentially want to do is to let the portal publish messages to a topic, but not let the PAYROLLAPPs subscribe to that topic (since all of them would get a copy of the message). So what you would need is some logic in between that distributes the message from the topic subscription to one queue per application type. From that queue, normal load balanced (the competing consumer pattern) can be implemented with JMS.

Different JMS providers have special implementations that can acomplish this task ActiveMQ has its Virtual Destinations, WebSphere MQ has its server side subscriptions that can subscribe from a topic to a queue. In the case your JMS provider does not have any way to handle this, you might want to look at adding some routing middleware to your topology. Apache Camel is a nice, lightweight one, but there are lots of others that can setup some routing in the middle without affecting the real applications.

Update for detailed questions

  1. The Queues below the line has to be there for sure (if your applications uses messaging). The "Some distrib. logic" box shouldn't be needed. The "Some routing logic" box could be an ESB or in this very case, be implemented in the messaging server, for instance ActiveMQ with virtual destinations (or WebSphere MQ or perhaps RabbitMQ among others).

  2. There are a lot of buzwords in the domain of integration. Simplified (depending on who you ask - ESB can also be seen as an architectural pattern, but let's keep it simple), an ESB is a server application (or a topology of multiple servers in practice) that is a centerpiece of an integration landscape. The ESB server simply contain logic and small message flows that takes messages (files, or whatever) from one application and routes them to many applications, transform them to other formats, encrypts, converts from one transport protocol such as HTTP/SOAP to File etc.

JMS is a rather confusing and missused word. Java has to some extent dominated the enterprise messaging domain in the last years, so JMS is sometimes used pretty much as a synonym to Messaging. However, Messaging, (or message queueing, asynchronous messaging, MOM=message oriented middleware, etc.) is to be simply considered as a family of similar transport protocols that features a central relaying server. Is is not at all a Java only thing. Many successful ESBs setups I worked with actually leverage on a Messaging backbone

  1. In your situation, I would not go too deep into the academical/philosophical differences between ESB and EAI software. They will most likely do pretty much the same things for you. Instead, look at the hard facts such as price, support, resource footprint, monitoring, tech. features, learning curve etc. Be it Camel/ServiceMix, Mule, JBoss ESB, Microsoft BizTalk, IBM Message Broker, Tibco etc.

  2. Hah! Was it perhaps a salesman? An ESB will do just fine. A Messaging server will do in your case as well, such as ActiveMQ as has been pointed out already. BPM suits are fine for orchestrating semi-automatized business processes or if there is major business logic in the integration layer. Otherwise, avoid that added complexity.

like image 148
Petter Nordlander Avatar answered Jan 08 '23 19:01

Petter Nordlander


  1. Irrespective of the solution(pure JMS, ESB, EAI), does the part below the horizontal dotted line(application-specific queues) needs to be implemented?

The consumers need to be implemented in such a way that work with you chosen solution but you shouldn't have to worry about the creation of the queue per consumer or the distribution logic (assuming that the consumers can consume directly from the chosen tech)

  1. How does the usage of ESB(JBoss ESB etc.), instead of ‘pure’ JMS(Active MQ etc.), help/ hamper? Does ESB provide any advantage over JMS which is ‘java-only’(?). I am hell confused – ‘ESB or JMS’, even after referring threads like these : JMS and ESB - how they are related?. It has one reply which says “JMS is not well suited for the integration of REST services, File systems, S/FTP, Email, Hessian, SOAP etc. which are better handled with an ESB that supports these types natively. For example, if you have a process that dumps a CSV file of 500MB at midnight, and you want another system to pickup the file, parse CSV and import into a database, this can easily be accomplished by an ESB - whereas a solution with just JMS will be bad. Similarly, integration of REST services, with load balancing/ failover to multiple backend instances can be done better with an ESB supporting HTTP/S natively.” It only added to my confusion !!!

My opinion is that ESB would overcomplicate this solution. It's designed (amongst other things) to assist integration with different technologies, but simpler solutions do this too - e.g - Apache Camel provides a very easy way of communicating using a huge variety of transports (including ActiveMQ).

Not all JMS implementations cater for connectivity from other languages, but ActiveMQ does using it's STOMP connector.

  1. Is the usage of EAI framework (Apache Camel etc.) an approach entirely different from the pure JMS or ESB approach? If yes, how and what are the pros/cons?

Apache Camel and JMS are complementary technologies, as are JMS and ESB. Camel (& Spring Integration) are lightweight, simple and portable. ESB's are much more heavyweight and will normally lead to greater coupling with the ESB/application server.

  1. I was told that ESB alone won’t help, BPM(or something else?) needs to be used to define and store the ‘routing’ logic – is this true?

It depends what your 'routing' logic is, it looks to me like you don't require routing logic, you just require guaranteed delivery to 1payroll consumer and 1 helpdesk consumer. BPM would be more useful where you want to selectively public data/invoke a service based on some characteristic of that data.

I strongly suggest reading http://activemq.apache.org/virtual-destinations.html, using these you would:

  1. Send messages to the ActiveMQ broker, onto a VirtualTopic, e.g. VirtualTopic.X
  2. Register the Payroll and Helpdesk consumers, as consumers on queues that ActiveMQ dynamically creates on the topic - e.g. Consumer.Payroll.VirtualTopic.X. Both Payroll consumers should be registered with the same string.
  3. ActiveMQ will automatically retain a marker that represents what each set of consumers hasn't consumed. This means that 100% of messages will be processed by a Payroll consumer but a message will never be sent to > 1 payroll consumer.
  4. Add/remove consumers at will.

N.B.I believe that other products, e.g. Apache QPID provide similar functionality - I'm just most aware of ActiveMQ, and have had success with this approach

like image 36
user793370 Avatar answered Jan 08 '23 20:01

user793370