I understand, in a fuzzy sort of way, how regular ACID transactions work. You perform some work on a database in such a way that the work is not confirmed until some kind of commit flag is set. The commit part is based on some underlying assumption (like a single disk block write is atomic). In the event of a catastrophic error, you can just clear out the uncommitted data in the recovery phase.
How do distributed transactions work? In some of the MS documentation I have read that you can somehow perform a transaction across databases and filesystems (among other things).
This technology could be (and probably is) used for installers, where you want the program to be fully installed or fully absent. You simply begin a transaction at the start of the installer. Next you could connect to the registry and filesystem, making the changes that define the installation. When the job is done, simply commit, or roll back if the installation fails for some reason. The registry and filesystem are automatically cleaned for you by this magical distributed transaction coordinator.
How is it possible that two disparate systems can be transacted upon in this fashion? It seems to me that it is always possible to leave the system in an inconsistent state, where the filesystem has committed its changes and the registry has not. I think in MSDTC it is even possible to perform a transaction across the network.
I have read http://blogs.msdn.com/florinlazar/archive/2004/03/04/84199.aspx, but it feels like only the beginning of the explanation, and that step 4 should be expanded considerably.
Edit: From what I gather on http://en.wikipedia.org/wiki/Distributed_transaction, it can be accomplished by a two-phase commit (http://en.wikipedia.org/wiki/Two-phase_commit). After reading this, I still don't understand the method 100%; it seems like there is a lot of room for error between the steps.
A distributed transaction is a set of operations on data that is performed across two or more data repositories (especially databases). It is typically coordinated across separate nodes connected by a network, but may also span multiple databases on a single server.
In a failover cluster, MSDTC maintains its log on the shared drive, so the log fails over to the new primary node as well. In this case, MSDTC can use the log to handle the in-doubt transactions. If you do not configure MSDTC in the failover cluster, SQL Server uses the MSDTC running locally on the node.
MSDTC is a separate Windows service which coordinates distributed transactions across SQL Server instances. When deploying SQL Server in a highly available environment like Windows Failover Clustering, there are certain best practices that can make the MSDTC service's behavior more predictable.
About "step 4":
The transaction manager coordinates with the resource managers to ensure that either all of them succeed in doing the requested work or none of the work is done, thus maintaining the ACID properties.
This of course requires all participants to provide the proper interfaces and (error-free) implementations. The interface looks vaguely like this:
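public interface ITransactionParticipant {
    // Phase 1: vote on whether this participant can guarantee a commit.
    bool WouldCommitWork();
    // Phase 2: make the prepared work permanent.
    void Commit();
    // Phase 2 (abort path): undo the work done in this transaction.
    void Rollback();
}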
At commit time, the transaction manager asks all participants whether they are willing to commit the transaction. A participant may only assert this if it is able to commit the transaction under all allowable error conditions (validation, system errors, etc.). After all participants have asserted their ability to commit, the manager sends the Commit() message to all participants. If any participant instead raises an error or times out, the whole transaction aborts and the individual participants are rolled back.
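To make the two phases concrete, here is a minimal sketch in Python rather than in the C#-like notation of the interface. Everything in it is an assumption for illustration: `InMemoryParticipant`, `run_two_phase_commit`, and the `state` strings are made-up names, not part of MSDTC or any real API. A timeout is modeled simply as an exception raised by the participant.

```python
class InMemoryParticipant:
    """Hypothetical participant that only tracks which state it is in."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def would_commit_work(self):
        # A "yes" vote is a promise: from here on, commit() must not fail.
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def run_two_phase_commit(participants):
    """Return True if everyone committed, False if everyone was rolled back."""
    # Phase 1: collect the votes. An error (or timeout) counts as a "no".
    try:
        all_ready = all(p.would_commit_work() for p in participants)
    except Exception:
        all_ready = False

    # Phase 2: send the same decision to every participant.
    if all_ready:
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


registry = InMemoryParticipant("registry")
filesystem = InMemoryParticipant("filesystem", can_commit=False)
outcome = run_two_phase_commit([registry, filesystem])
```

In this run the filesystem votes "no", so `outcome` is `False` and the registry, which had already voted "yes", is rolled back too. The sketch leaves out what a real coordinator also has to do: record its decision durably before starting phase 2, so that it can repeat the decision to participants after a crash.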
This protocol requires participants to have recorded their whole transaction content before asserting their ability to commit. Of course, this has to go into a special local transaction log structure so that the participant can recover from various kinds of failures.
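The "record before you vote" rule can be sketched as follows. This is only an illustration under stated assumptions: `LoggingParticipant`, its JSON-lines log format, and the `recover` helper are hypothetical and do not describe how MSDTC or any particular resource manager stores its log.

```python
import json
import os


class LoggingParticipant:
    """Hypothetical participant that logs its pending work before voting yes."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}     # committed state
        self.pending = {}  # changes made inside the open transaction

    def write(self, key, value):
        self.pending[key] = value

    def _append(self, record):
        # Force the record to disk before returning, so it survives a crash.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def would_commit_work(self):
        # The whole transaction content is logged BEFORE the "yes" vote.
        self._append({"type": "prepared", "changes": self.pending})
        return True

    def commit(self):
        self._append({"type": "committed"})
        self.data.update(self.pending)
        self.pending = {}

    def rollback(self):
        self._append({"type": "rolled_back"})
        self.pending = {}

    @classmethod
    def recover(cls, log_path):
        """Replay the log after a crash; returns (participant, in_doubt)."""
        participant = cls(log_path)
        in_doubt = False
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    if record["type"] == "prepared":
                        participant.pending = record["changes"]
                        in_doubt = True
                    else:
                        if record["type"] == "committed":
                            participant.data.update(participant.pending)
                        participant.pending = {}
                        in_doubt = False
        return participant, in_doubt
```

If the participant crashes after voting "yes" but before hearing the decision, `recover` returns it with `in_doubt` set to `True` and the prepared changes still in `pending`. It can neither commit nor roll back on its own at that point; it has to ask the transaction manager for the outcome and then call `commit()` or `rollback()`. This is the gap between the steps that the log closes: whichever side crashes, the logged state tells it where it was in the protocol.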