
What is a "distributed transaction"?

The Wikipedia article for Distributed transaction isn't very helpful.

Can you give a high-level description with more details of what a distributed transaction is?

Also, can you give me an example of why an application or database should perform a transaction that updates data on two or more networked computers?

I understand the classic bank example; I care more about distributed transactions in Web-scale databases like Dynamo, Bigtable, HBase, or Cassandra.

Zombie asked Nov 18 '10 16:11




2 Answers

Distributed transactions span multiple physical systems, whereas standard transactions run on a single one. The participating systems must therefore synchronize with each other, a requirement that does not arise in a standard transaction.

From your Wikipedia reference...

...a distributed transaction can be seen as a database transaction that must be synchronized (or provide ACID properties) among multiple participating databases which are distributed among different physical locations...

Aaron McIver answered Sep 30 '22 23:09



Usually, transactions occur on one database server:

    BEGIN TRANSACTION;
    SELECT something FROM myTable;
    UPDATE myTable SET something = 'newValue';
    COMMIT;

A distributed transaction involves multiple servers:

    BEGIN TRANSACTION;
    -- local server
    UPDATE bankAccounts SET amount = amount - 100 WHERE accountNr = 1;
    -- remote server (the name is illustrative)
    UPDATE someRemoteDatabaseAtSomeOtherBank.bankAccounts SET amount = amount + 100 WHERE accountNr = 2;
    COMMIT;

The difficulty comes from the fact that the servers must communicate to ensure that transactional properties such as atomicity hold across both of them: if the transaction succeeds, the values must be updated on both servers. If the transaction fails, it must be rolled back on both servers. It must never happen that the values are updated on one server but not on the other.
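One common way to get this all-or-nothing behaviour is a two-phase commit, which the answer above does not name; the following is only a minimal in-memory sketch of that idea, not how any particular database implements it. The `Participant` class and the `transfer` function are hypothetical names invented for illustration, each "server" is just a Python object, and real concerns such as network failures, coordinator crashes, and durable logging are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for one database server holding account balances."""
    name: str
    balances: dict
    pending: dict = field(default_factory=dict)

    def prepare(self, account, delta):
        """Phase 1: stage the change and vote yes only if it can be applied."""
        new_balance = self.balances.get(account, 0) + delta
        if account not in self.balances or new_balance < 0:
            return False
        self.pending[account] = new_balance
        return True

    def commit(self):
        """Phase 2 (all voted yes): apply the staged change."""
        self.balances.update(self.pending)
        self.pending.clear()

    def rollback(self):
        """Phase 2 (someone voted no): discard the staged change."""
        self.pending.clear()


def transfer(source, src_account, target, dst_account, amount):
    """Coordinator: the change is applied on both servers or on neither."""
    work = [(source, src_account, -amount), (target, dst_account, amount)]
    votes = [server.prepare(account, delta) for server, account, delta in work]
    if all(votes):
        for server, _, _ in work:
            server.commit()
        return True
    for server, _, _ in work:
        server.rollback()
    return False


local_bank = Participant("local", {1: 150})
remote_bank = Participant("remote", {2: 20})

print(transfer(local_bank, 1, remote_bank, 2, 100))  # True: both sides commit
print(transfer(local_bank, 1, remote_bank, 2, 100))  # False: insufficient funds
print(local_bank.balances, remote_bank.balances)     # {1: 50} {2: 120}
```

In the second call the remote side votes yes and stages its update, but because the local side votes no, the coordinator rolls both back, so the remote balance stays at 120.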

like image 73
Heinzi Avatar answered Sep 30 '22 21:09

Heinzi