Do database transactions prevent race conditions?

It's not entirely clear to me what transactions in database systems do. I know they can be used to roll back a list of updates completely (e.g. deduct money from one account and add it to another), but is that all they do? Specifically, can they be used to prevent race conditions? For example:

// Java/JPA example
em.getTransaction().begin();
User u = em.find(User.class, 123);
u.credits += 10;
em.persist(u); // Note added in 2016: this line is actually not needed
em.getTransaction().commit();

(I know this could probably be written as a single update query, but that's not always the case.)
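For illustration, a sketch of that single-query form in JPQL, using the entity and field names from the example above; the read-modify-write then happens inside the database itself, so no stale in-memory value can be written back:

// Hedged sketch of the single-statement form referred to above.
em.getTransaction().begin();
em.createQuery("UPDATE User u SET u.credits = u.credits + 10 WHERE u.id = 123")
  .executeUpdate();
em.getTransaction().commit();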

Is this code protected against race conditions?

I'm mostly interested in MySQL 5 + InnoDB, but general answers are welcome too.

asked Jun 25 '11 by Bart van Heukelom

People also ask

How can race conditions be prevented?

To avoid race conditions, any operation on a shared resource – that is, on a resource that can be shared between threads – must be executed atomically. One way to achieve atomicity is by using critical sections — mutually exclusive parts of the program.

What is a race condition in databases?

A race condition occurs when two threads access a shared variable at the same time and at least one of them modifies it. For example, the first thread reads the variable, the second thread reads the same value, and both write back an updated copy; the write that lands last silently overwrites the other, so one update is lost.

Can data races cause race conditions?

Race conditions are sometimes produced by data races, which occur when two threads concurrently access the same memory location and at least one of the accesses is a write. Data races are easier to detect than race conditions because they are defined by that precise, checkable condition, whereas whether an interleaving counts as a race condition depends on what the application intended.


1 Answer

TL/DR: Transactions do not inherently prevent all race conditions. In every real-world database implementation you still need locking, abort-and-retry handling, or other protective measures. Transactions are not a secret sauce you can add to your queries to make them safe from all concurrency effects.
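For example, one way to make the snippet from the question safe is explicit row locking. A hedged sketch using JPA 2's pessimistic locking, which maps to SELECT ... FOR UPDATE on most databases:

import javax.persistence.LockModeType;

// Hedged sketch: em.find() with PESSIMISTIC_WRITE locks the row until
// commit, so a concurrent transaction doing the same thing blocks and
// then sees the updated credits value instead of a stale one.
em.getTransaction().begin();
User u = em.find(User.class, 123, LockModeType.PESSIMISTIC_WRITE);
u.credits += 10;
em.getTransaction().commit();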

Isolation

What you're getting at with your question is the I in ACID - isolation. The academically pure idea is that transactions should be perfectly isolated, so that the result of running them concurrently is the same as if they had executed serially. Few real RDBMS implementations deliver that: capabilities vary between engines, and the guarantees can be weakened further by choosing a lower isolation level like READ COMMITTED. In practice you cannot assume that transactions prevent all race conditions, even at SERIALIZABLE isolation.
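To make the chosen level explicit, a hedged JDBC sketch; the connection URL and credentials are placeholders, and what SERIALIZABLE actually guarantees still depends on the engine:

import java.sql.Connection;
import java.sql.DriverManager;

// Hedged sketch: asking the driver for SERIALIZABLE isolation.
// "jdbc:mysql://localhost/test" and the credentials are placeholders.
Connection conn = DriverManager.getConnection(
        "jdbc:mysql://localhost/test", "user", "password");
conn.setAutoCommit(false); // one explicit transaction
conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
// ... run queries, then conn.commit() or conn.rollback()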

Some RDBMSs have stronger abilities than others. For example, PostgreSQL 9.2 and newer have quite good SERIALIZABLE isolation that detects most (but not all) possible interactions between transactions and aborts all but one of the conflicting transactions. So it can run transactions in parallel quite safely.

Few, if any3, systems have truly perfect SERIALIZABLE isolation that prevents all possible races and anomalies, including issues like lock escalation and lock ordering deadlocks.

Even with strong isolation some systems (like PostgreSQL) will abort conflicting transactions, rather than making them wait and running them serially. Your app must remember what it was doing and re-try the transaction. So while the transaction has prevented concurrency-related anomalies from being stored to the DB, it's done so in a manner that is not transparent to the application.
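A hedged sketch of such a retry loop with JPA; MAX_RETRIES and the work callback are illustrative, and SQLState 40001 is the standard serialization-failure code (InnoDB also reports deadlocks under it):

import javax.persistence.EntityManager;
import javax.persistence.PersistenceException;

// Hedged sketch: re-run the unit of work when the database aborts it.
static void runWithRetry(EntityManager em, Runnable work) {
    final int MAX_RETRIES = 3; // illustrative limit
    for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
        try {
            em.getTransaction().begin();
            work.run();
            em.getTransaction().commit();
            return; // success
        } catch (PersistenceException e) {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            // A real implementation should walk the cause chain and only
            // retry on SQLState 40001 (serialization failure/deadlock).
            if (attempt == MAX_RETRIES) {
                throw e;
            }
        }
    }
}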

Atomicity

Arguably the primary purpose of a database transaction is that it provides for atomic commit. The changes do not take effect until you commit the transaction. When you commit, the changes all take effect at the same instant as far as other transactions are concerned. No transaction can ever see just some of the changes a transaction makes1,2. Similarly, if you ROLLBACK, then none of the transaction's changes ever get seen by any other transaction; it's as if your transaction never existed.

That's the A in ACID.
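A hedged sketch of the money-transfer example from the question, leaning on atomic commit (Account is an illustrative entity, not from the original):

import javax.persistence.EntityManager;

// Hedged sketch: either both balance changes become visible to other
// transactions, or neither does.
static void transfer(EntityManager em, long fromId, long toId, int amount) {
    em.getTransaction().begin();
    try {
        Account from = em.find(Account.class, fromId);
        Account to = em.find(Account.class, toId);
        from.balance -= amount;
        to.balance += amount;
        em.getTransaction().commit(); // both changes appear at one instant
    } catch (RuntimeException e) {
        em.getTransaction().rollback(); // as if the transaction never ran
        throw e;
    }
}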

Durability

Another key property is durability - the D in ACID. It specifies that once you commit a transaction, it must truly be saved to storage that will survive a fault like power loss or a sudden reboot.

Consistency

See Wikipedia.

Optimistic concurrency control

Rather than using locking and/or high isolation levels, ORMs like Hibernate, EclipseLink, etc. commonly use optimistic concurrency control (often called "optimistic locking") to overcome the limitations of weaker isolation levels while preserving performance.

A key feature of this approach is that it lets you span work across multiple transactions, which is a big plus in systems with high user counts and potentially long delays between interactions with any given user.
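With JPA this is usually just a version column. A hedged sketch, where the entity mirrors the one in the question; the provider appends "WHERE version = ?" to its UPDATEs and fails if the row changed underneath it:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

// Hedged sketch of optimistic locking via a version field. On a conflict
// the provider throws javax.persistence.OptimisticLockException, which the
// application handles by reloading and retrying, much like the
// serialization-failure case above.
@Entity
public class User {
    @Id
    public long id;

    public int credits;

    @Version
    public long version; // bumped automatically on each successful update
}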

References

In addition to the in-text links, see the PostgreSQL documentation chapter on locking, isolation and concurrency. Even if you're using a different RDBMS you'll learn a lot from the concepts it explains.


1I'm ignoring the rarely implemented READ UNCOMMITTED isolation level here for simplicity; it permits dirty reads.

2As @meriton points out, the corollary isn't necessarily true. Phantom reads occur in anything below SERIALIZABLE: an earlier part of an in-progress transaction may not see changes made by another, not-yet-committed transaction, while a later part of the same transaction does see them once the other transaction commits.

3Well, IIRC SQLite does, by virtue of locking the whole database when a write is attempted, but that's not what I'd call an ideal solution to concurrency issues.

answered Oct 12 '22 by Craig Ringer