Data consistency in XA transactions


Suppose we have a database (e.g. Oracle) and a JMS provider (e.g. HornetQ) participating in an XA transaction. A message is sent to a JMS queue and some data are persisted in the database in the same distributed transaction. After the transaction is committed, a message consumer will read the persisted data and process them in a separate transaction.

Regarding the first XA transaction, the transaction manager (e.g. JBoss) may execute the following sequence of events:

prepare (HornetQ)
prepare (Oracle)
commit (HornetQ)
commit (Oracle)

What happens if the message consumer starts reading the data after commit is completed in HornetQ, but is still being executed in Oracle? Will the message consumer read stale data?

The question can be generalized to any set of resources participating in an XA transaction: is there a small time window (while the commit phases are being executed) in which a reader in another concurrent transaction can observe an inconsistent state, by reading committed data from one resource and stale data from another?
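To make the window concrete, here is a minimal Python sketch of the commit sequence above. It is a toy in-memory model, not real XA: `ToyResource`, `two_phase_commit` and the sample values are made up for illustration, and it assumes that readers only ever see committed state and are never blocked.

```python
class ToyResource:
    """Hypothetical in-memory stand-in for a transactional resource."""

    def __init__(self, name, value):
        self.name = name
        self.committed = value   # what concurrent readers can see
        self.pending = None      # staged by prepare, invisible to readers

    def prepare(self, value):
        self.pending = value

    def commit(self):
        self.committed = self.pending
        self.pending = None

    def read(self):
        # Assumption: readers are not blocked and see committed state only.
        return self.committed


def two_phase_commit(resources, values, observer):
    """Prepare every resource, then commit them one by one.

    observer() is called after each individual commit to record what a
    concurrent reader would see at that moment.
    """
    for resource, value in zip(resources, values):
        resource.prepare(value)
    snapshots = []
    for resource in resources:
        resource.commit()
        snapshots.append(observer())
    return snapshots


queue = ToyResource("queue", None)
database = ToyResource("database", "old row")

snapshots = two_phase_commit(
    [queue, database],
    ["message #1", "new row"],
    observer=lambda: (queue.read(), database.read()),
)

for seen in snapshots:
    print(seen)
```

The first snapshot is taken after the queue has committed but before the database has, so an observer at that moment sees the new message next to the old row. Whether a real resource exposes this window depends on how it treats readers of prepared data, which is exactly what the question asks.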

I would say that the only way for transactional resources to prevent this is to block all readers of the affected data from the moment the prepare phase completes until the commit has completed. This way the example message consumer mentioned above would block until the data is committed in the database.
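The blocking idea can be sketched in Python as well. This only illustrates the behaviour proposed above, under the assumption that a resource chooses to hold readers while a transaction is prepared. `BlockingResource` is a hypothetical class and says nothing about how Oracle or HornetQ actually behave; rollback is left out to keep it short.

```python
import threading


class BlockingResource:
    """Hypothetical resource that blocks readers between prepare and commit."""

    def __init__(self, value):
        self._value = value
        self._pending = None
        self._prepared = False
        self._cond = threading.Condition()

    def prepare(self, new_value):
        with self._cond:
            self._pending = new_value
            self._prepared = True   # from now on readers have to wait

    def commit(self):
        with self._cond:
            self._value = self._pending
            self._pending = None
            self._prepared = False
            self._cond.notify_all()  # wake up every waiting reader

    def read(self, timeout=None):
        with self._cond:
            # Wait until no prepared transaction is outstanding.
            ready = self._cond.wait_for(lambda: not self._prepared, timeout=timeout)
            if not ready:
                raise TimeoutError("resource still has a prepared transaction")
            return self._value
```

With this behaviour a consumer that calls `read()` after prepare either waits until commit and then sees the new value, or gives up after the timeout. It never sees the old value during the commit window.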


asked Jun 08 '16 18:06

Dragan Bozanovic

2 Answers

Unfortunately, XA transactions don't support this kind of consistency. When mapped to the CAP theorem, XA addresses availability and partition tolerance across multiple datastores, and in doing so it must sacrifice consistency. When using XA, you have to embrace eventual consistency.

In any event, creating systems that are CP or AP is hard enough that you will face this problem regardless of your datastore or transactional model.


answered Sep 28 '22 07:09

Justin

I have some experience with a slightly different environment based on WebLogic JMS and Oracle 11g. In this answer I assume that it works exactly the same way. I hope my answer helps.

In our case there was a "distant" system that had to be notified about various events that happened inside the local system. The other system also read from our database, so the use case seems almost identical to yours. The sequence of events was exactly the same as yours. On the test systems there was not a single failure. Everyone thought it would work, but some of us doubted that it was the correct solution. Once the software hit production, some of the BPM processes ran unpredictably. So the simple answer to your question: yes, it is possible, and everyone should be aware of it.

Our solution was, in my opinion, not well planned, but we recognised that the small time window between the two commits was breaking the system, so we added a delay to the queue (if I remember correctly, 1-2 minutes). That was enough for the other commit to finish, so that consistent data could be read. In my view it is not the best solution, because it does not solve the synchronisation problem (what if an Oracle transaction takes longer than 1-2 minutes?).

Here is a great blog post that is worth reading; the last solution in it seems the best to me. We implemented it in another system and it works much better. It is important to limit the retries (re-reads), with some error reporting, to prevent "stuck" threads. With these restrictions I have not been able to find a better solution so far, so if anyone has a better option, I am looking forward to hearing it. :)
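A rough Python sketch of the bounded re-read idea, not taken from the blog post: the consumer re-reads until the data it expects is visible, and gives up with an error after a fixed number of attempts. All names here (`read_with_retries`, `fetch`, `is_ready`, `DataNotVisibleError`) are hypothetical, and it assumes the message carries enough information (for example a row id or version) for the consumer to tell whether the database commit is already visible.

```python
import time


class DataNotVisibleError(Exception):
    """Raised when the expected data never became visible."""


def read_with_retries(fetch, is_ready, max_attempts=5, delay_seconds=0.2,
                      sleep=time.sleep):
    """Re-read until is_ready(row) is true, at most max_attempts times.

    fetch    -- callable that reads the current row (hypothetical)
    is_ready -- callable deciding whether the row reflects the commit
    sleep    -- injectable so that tests do not have to wait
    """
    for attempt in range(1, max_attempts + 1):
        row = fetch()
        if is_ready(row):
            return row
        if attempt < max_attempts:
            sleep(delay_seconds)  # give the other commit time to finish
    # Bounded: report the problem instead of leaving a thread stuck.
    raise DataNotVisibleError(
        f"data not visible after {max_attempts} attempts"
    )
```

In a JMS consumer the exception would typically be logged, and the message rolled back or moved to an error queue, so that the failure is reported rather than silently retried forever.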

Edit: typos.

answered Sep 28 '22 06:09

Hash
