I'm working on a PoC for a small process engine based on Camel. The requirement is to be able to execute a set of sequential steps, each of which could potentially take hours to complete. An asynchronous communication style is the obvious choice in this case, but I'm struggling to get the "process" part right.
When I send a message to an external system, I need to wait for completion. Since that might take a long time, I'm thinking of stopping the processing of a given step after the message is sent and then starting a new "job" when the completion message comes back. So the processing of each step would be handled by a Camel route that starts at the same JMS queue, and a content-based router would then decide which concrete logic to execute based on the message headers or content.
The problem, however, is how to avoid potential message loss. For example, at a given step I send a message and stop processing. If the external system for some reason does not process the message, my system never receives a notification. The process is then stuck unless some other component generates a message to wake it up.
Also, since the system could be shut down at any time, I have to build in logic to continue processing messages after a restart (which implies some kind of message persistence, redelivery, and transaction management strategy).
All these issues add up, so I'd like to ask experienced Camel champions for suggestions on how to design this kind of logic using Camel. I know that a dedicated BPM product or ESB might handle this problem much more easily, but I don't want to bloat the solution.
Any advice is welcome, especially on Camel capabilities that could help simplify the solution.
Camel's BAM support should provide you with some of the long-lived process support (timeouts, error handling scenarios, etc.). Also, JMS and transactions will help with the reliable/persistent messaging requirements.
Good luck, and let us know if you land on an alternate approach.
I would suggest that the Claim Check pattern is the most appropriate one for persisting state between long-running external invocations. Check in the state of your process before sending the outbound message.
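To make the check-in/check-out idea concrete, here is a minimal plain-Java sketch. It is not Camel's own Claim Check API: the `ClaimCheckStore` class and its method names are hypothetical, and the in-memory map only stands in for whatever persistent store (database, file, etc.) would be needed to survive a restart.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a persistent claim-check store. */
final class ClaimCheckStore {
    // claim ticket -> serialized process state (assumption: state fits in a String)
    private final Map<String, String> states = new ConcurrentHashMap<>();

    /** Checks in the process state and returns a ticket that can double as the correlation ID. */
    String checkIn(String processState) {
        String ticket = UUID.randomUUID().toString();
        states.put(ticket, processState);
        return ticket;
    }

    /** Redeems a ticket: returns and removes the state, or empty if unknown or already claimed. */
    Optional<String> checkOut(String ticket) {
        return Optional.ofNullable(states.remove(ticket));
    }

    public static void main(String[] args) {
        ClaimCheckStore store = new ClaimCheckStore();
        // Before sending the outbound message, park the state and keep only the ticket.
        String ticket = store.checkIn("order-42:step-2");
        // When the completion message arrives with the same ticket, resume from the stored state.
        System.out.println(store.checkOut(ticket).orElse("<nothing to resume>"));
    }
}
```

Only the ticket travels with the outbound message, so the external system never needs to carry the process state around.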
One way to detect a missing reply from the external process is to post two messages. One message goes to the external process; the other goes to an internal queue. I'll call the second one the process timeout message. It is a very small message with just the correlation ID and an appropriate expiration time. If the reply is received from the external process, the receiving route will have the correlation ID and be able to remove the message from the process timeout queue. If the external process does not reply, the timeout message expires, so the dead-letter queue for the process timeout queue should be connected to a Camel route that alerts an administrator or takes appropriate automatic action, e.g. retrieving the claim check. This should allow persistent state with a minimum of overhead and no BPM tool, and hence no single point of failure.
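The bookkeeping behind the process timeout message can be sketched in plain Java as follows. This is an illustration only, under stated assumptions: `ProcessTimeoutTracker` and its methods are hypothetical names, the map stands in for the broker's timeout queue, and `collectExpired` plays the role of the dead-letter queue consumer. In a real setup the broker's message expiration would do this work instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory model of the "process timeout queue". */
final class ProcessTimeoutTracker {
    // correlation ID -> deadline (milliseconds on whatever clock the caller passes in)
    private final Map<String, Long> deadlines = new ConcurrentHashMap<>();

    /** Posts the timeout marker at the same moment the outbound message is sent. */
    void register(String correlationId, long nowMillis, long ttlMillis) {
        deadlines.put(correlationId, nowMillis + ttlMillis);
    }

    /** Reply arrived: remove the marker. Returns false if no marker was pending any more. */
    boolean replyReceived(String correlationId) {
        return deadlines.remove(correlationId) != null;
    }

    /** Dead-letter analogue: removes and returns every correlation ID whose deadline has passed. */
    List<String> collectExpired(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        ProcessTimeoutTracker tracker = new ProcessTimeoutTracker();
        tracker.register("proc-1", 0L, 1_000L);
        tracker.register("proc-2", 0L, 1_000L);
        tracker.replyReceived("proc-1");                    // proc-1 answered in time
        System.out.println(tracker.collectExpired(2_000L)); // only proc-2 is left to escalate
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; each expired ID would then be handed to the alerting or recovery logic described above.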