I've read a post saying that:
"We cannot implement a traditional transaction system like two-phase commit in microservices in a distributed environment."
I agree completely with this.
But it would be great if someone here could explain the exact reason for this. What issues would I face if I implemented two-phase commit with microservices?
Thanks in advance
The problem with 2PC is that it is quite slow compared to the operation time of a single microservice. Coordinating the transaction between microservices, even when they are on the same network, can really slow the system down, so this approach isn't usually used in high-load scenarios.
Two-phase commit (2PC) is a standardized protocol that ensures atomicity, consistency, isolation and durability (ACID) of a transaction; it is an atomic commitment protocol for distributed systems.
The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: After a participant has sent an agreement message to the coordinator, it will block until a commit or rollback is received.
As its name hints, 2PC has two phases: a prepare phase and a commit phase. In the prepare phase, all microservices are asked to prepare for some data change that should be made atomically. Once all microservices have prepared, the commit phase asks all of them to make the actual changes.
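The two phases can be sketched in a few lines. This is a minimal illustration, not a real implementation: the participants here are plain in-process objects with hypothetical names, whereas a real coordinator would use network calls, timeouts, and durable logs.

```python
# Minimal sketch of the two phases of 2PC. Participants and their names
# ("orders", "inventory") are purely illustrative.

class Participant:
    def __init__(self, name):
        self.name = name
        self.state = "idle"

    def prepare(self):
        # Phase 1: lock resources and vote. "prepared" means this
        # participant promises it will be able to commit later.
        self.state = "prepared"
        return True  # vote yes

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit/rollback): unanimous yes -> commit, otherwise rollback.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled_back"


services = [Participant("orders"), Participant("inventory")]
print(two_phase_commit(services))   # committed
print([p.state for p in services])  # ['committed', 'committed']
```

Note that between `prepare()` and the phase-2 call, every participant is holding its locks; that window is exactly where the problems discussed below arise.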
The main reason for avoiding two-phase commit is that the transaction coordinator is a kind of dictator: it tells all the other nodes what to do. Usually the coordinator is embedded in the application server. The problem arises when, after the first (prepare) phase, the coordinator or the application server goes down. Now the participating nodes don't know what to do. They cannot commit, because they don't know whether another participant replied to the coordinator with a "no", and they cannot roll back, because another participant might have said "yes". So until the coordinator comes back, say after 15 minutes, and completes the second phase, the participating data stores remain in a locked state. This inhibits scalability and performance.

Worse things happen when the coordinator's transaction log gets corrupted after the first phase. In that case, the data stores remain locked forever; even restarting the processes won't help. The only solution is to manually check the data to ensure consistency and then remove the locks. These situations usually arise under high pressure, so this is a huge operational overhead. Hence traditional two-phase commit is not a good solution.
However, it should be noted that some modern systems, such as Kafka, have also implemented a form of two-phase commit. This differs from the traditional solution in that every broker can act as a coordinator, so Kafka's leader-election algorithm and replication model alleviate the issues of the traditional model.