 

Is it safe to rely on Postgres' deadlock detection for concurrency control?


I've been running into occasional deadlocks in my application because two transactions need to update the same rows, but in different orders (e.g., transaction A updates rows X then Y, while transaction B updates rows Y then X).

For various reasons, the traditional approaches to avoiding this kind of deadlock – explicit locking, or updating rows in a consistent order – are less than ideal.

Since the updates I'm trying to perform are otherwise idempotent and order-independent, is it safe and reasonable to simply catch these occasional deadlocks at the application level and retry the transaction?

For example:

def process_update(update):
    for attempt in range(10):
        try:
            execute("SAVEPOINT foo")
            for row in update:
                execute("UPDATE mytable SET … WHERE …", row)
            execute("RELEASE SAVEPOINT foo")
            return  # success
        except Deadlock:
            execute("ROLLBACK TO SAVEPOINT foo")
    raise Exception("Too many retries")
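
For concreteness, here is the same loop sketched with psycopg2 as the driver (an assumption on my part; the connection handling and table are placeholders). There the deadlock surfaces as psycopg2.errors.DeadlockDetected:

import psycopg2

def process_update(conn, update):
    with conn.cursor() as cur:
        for attempt in range(10):
            try:
                cur.execute("SAVEPOINT foo")
                for row in update:
                    cur.execute("UPDATE mytable SET … WHERE …", row)
                cur.execute("RELEASE SAVEPOINT foo")
                return  # success
            except psycopg2.errors.DeadlockDetected:
                # roll back to the savepoint so the enclosing
                # transaction stays usable, then retry
                cur.execute("ROLLBACK TO SAVEPOINT foo")
    raise Exception("Too many retries")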

Is this a reasonable idea? Or are there costs associated with Postgres' deadlock detection that might make it dangerous?

David Wolever asked Feb 18 '17

1 Answer

I did a lot of research and experimentation into this for a system that runs 50 to 100 concurrent processes against the same tables. There are a number of transaction failures that can happen besides basic deadlocks, and my case includes both read committed and serializable transactions. In no situation did handling this at the application level cause any issues. Fortunately, Postgres fails immediately, so the only performance hit is to the application; there is nothing significant to the database.

The key components are catching every type of error, knowing which cases require a rollback, and using a randomized, increasing backoff between retries. I found that immediate retries or static sleep times cause processes to simply deadlock each other repeatedly, with a bit of a domino effect, which makes sense.

This is the complete logic my system requires to handle every concurrency issue (pseudocode):

begin transaction (either read committed or serializable)
while not successful and count < 5
    try
        execute sql
        commit
        set successful
    except
        if error code is '40P01' or '55P03'
            # deadlock detected, or lock not available
            sleep a random time (200 ms to 1 sec) * number of retries
        else if error code is '40001' or '25P02'
            # serialization failure, or "in failed SQL transaction"
            rollback
            sleep a random time (200 ms to 1 sec) * number of retries
            begin transaction
        else if error message is 'There is no active transaction'
            sleep a random time (200 ms to 1 sec) * number of retries
            begin transaction
    increment count
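
As a sketch, here is that logic as runnable Python, again assuming psycopg2 (adapt the error handling to your driver). psycopg2 opens a transaction implicitly on the first execute, so the explicit begin and the "no active transaction" branch drop out; the SQL and function name are placeholders:

import random
import time

import psycopg2

def run_with_retries(conn, sql, params=None, max_attempts=5):
    # isolation level (read committed or serializable) is assumed to
    # have been chosen up front, e.g. via conn.set_session(...)
    for attempt in range(1, max_attempts + 1):
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params)
            conn.commit()
            return
        except psycopg2.Error as e:
            if e.pgcode in ('40P01', '55P03'):
                # deadlock detected or lock not available: back off and
                # retry; if the transaction was aborted, the next attempt
                # raises 25P02 and is rolled back below
                time.sleep(random.uniform(0.2, 1.0) * attempt)
            elif e.pgcode in ('40001', '25P02'):
                # serialization failure or "in failed SQL transaction":
                # roll back so the retry starts a fresh transaction
                conn.rollback()
                time.sleep(random.uniform(0.2, 1.0) * attempt)
            else:
                raise
    raise Exception("Too many retries")

A call then looks like run_with_retries(conn, "UPDATE mytable SET … WHERE …", row).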
Matt S answered Sep 25 '22