 

Strange PostgreSQL deadlock issue with SELECT FOR UPDATE

I am building a locking system on top of PostgreSQL. It has two methods, acquire and release.

acquire works like this:

BEGIN
while True:
    SELECT id FROM my_locks WHERE locked = false AND id = '<NAME>' FOR UPDATE
    if no rows return:
        continue
    UPDATE my_locks SET locked = true WHERE id = '<NAME>'
    COMMIT
    break

And release works like this:

BEGIN
UPDATE my_locks SET locked = false WHERE id = '<NAME>'
COMMIT

This looks pretty straightforward, but it doesn't work. The strange part is that I thought

SELECT id FROM my_locks WHERE locked = false AND id = '<NAME>' FOR UPDATE

should acquire the lock on the target row only if that row's locked is false. But in reality it doesn't work that way. Somehow, even when no locked = false row exists, it acquires a lock anyway. As a result, I have a deadlock issue. It looks like this:

[Image: SELECT FOR UPDATE deadlock]

The release is waiting on the SELECT FOR UPDATE, and the SELECT FOR UPDATE is looping forever while holding a lock for no reason.

To reproduce the issue, I wrote a simple test here:

https://gist.github.com/victorlin/d9119dd9dfdd5ac3836b

You can run it with psycopg2 and pytest. Remember to change the database settings, then run:

pip install pytest psycopg2
py.test -sv test_lock.py
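
In essence, the test spins up two workers that both run acquire followed by release against the same lock record. A rough sketch of that scenario (not the gist verbatim; the DSN, the table contents, and the worker driver here are assumptions):

import threading

import psycopg2

DSN = "dbname=test"  # adjust to your own database settings

def worker(name):
    conn = psycopg2.connect(DSN)
    cur = conn.cursor()
    # acquire
    while True:
        cur.execute(
            "SELECT id FROM my_locks WHERE locked = false AND id = %s FOR UPDATE",
            (name,))
        if cur.fetchone() is None:
            continue  # never commits, so any FOR UPDATE lock it holds is never released
        cur.execute("UPDATE my_locks SET locked = true WHERE id = %s", (name,))
        conn.commit()
        break
    # release
    cur.execute("UPDATE my_locks SET locked = false WHERE id = %s", (name,))
    conn.commit()
    conn.close()

# Assumes my_locks already contains the row ('foo', locked = false).
threads = [threading.Thread(target=worker, args=('foo',)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

With two workers this hangs: one worker's release blocks behind the other worker's spinning acquire transaction.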
Fang-Pen Lin asked Aug 06 '15


1 Answer

The test case plays out like this:

  • Thread-1 runs the SELECT and acquires the record lock.
  • Thread-2 runs the SELECT and enters the lock's wait queue.
  • Thread-1 runs the UPDATE / COMMIT and releases the lock.
  • Thread-2 acquires the lock. Detecting that the record has changed since its SELECT, it rechecks the data against its WHERE condition. The check fails, and the row is filtered out of the result set, but the lock is still held.

This behaviour is mentioned in the FOR UPDATE documentation:

...rows that satisfied the query conditions as of the query snapshot will be locked, although they will not be returned if they were updated after the snapshot and no longer satisfy the query conditions.

This can have some unpleasant consequences, but a superfluous lock isn't that bad, all things considered.

Probably the simplest workaround is to limit the lock duration by committing after every iteration of acquire, as sketched below. There are various other ways to prevent it from holding this lock (e.g. SELECT ... NOWAIT, running at the REPEATABLE READ or SERIALIZABLE isolation level, or SELECT ... SKIP LOCKED in Postgres 9.5).
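
A minimal sketch of the commit-per-iteration workaround (psycopg2, reusing the question's my_locks schema; the back-off interval is an arbitrary choice):

import time

def acquire(conn, name):
    cur = conn.cursor()
    while True:
        cur.execute(
            "SELECT id FROM my_locks WHERE locked = false AND id = %s FOR UPDATE",
            (name,))
        if cur.fetchone() is None:
            conn.commit()    # end the transaction, releasing the superfluous row lock
            time.sleep(0.1)  # back off instead of hammering the server
            continue
        cur.execute("UPDATE my_locks SET locked = true WHERE id = %s", (name,))
        conn.commit()
        return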

I think the cleanest implementation using this retry-loop approach would be to skip the SELECT altogether, and just run an UPDATE ... WHERE locked = false, committing each time. You can tell if you acquired the lock by checking cur.rowcount after calling cur.execute(). If there is additional information you need to pull from the lock record, you can use an UPDATE ... RETURNING statement.
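
A sketch of that UPDATE-only loop (again psycopg2 against the question's schema; the polling interval is an arbitrary choice):

import time

def acquire(conn, name):
    cur = conn.cursor()
    while True:
        cur.execute(
            "UPDATE my_locks SET locked = true WHERE id = %s AND locked = false",
            (name,))
        conn.commit()          # commit win or lose, so no lock lingers
        if cur.rowcount == 1:  # the UPDATE hit a row, so the lock is ours
            return
        time.sleep(0.1)        # lost the race; poll again shortly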

But I would have to agree with @Kevin, and say that you'd probably be better off leveraging Postgres' built-in locking support than trying to reinvent it. It would solve a lot of problems for you, e.g.:

  • Deadlocks are automatically detected
  • Waiting processes are put to sleep, rather than having to poll the server
  • Lock requests are queued, preventing starvation
  • Locks would (generally) not outlive a failed process

The easiest way might be to implement acquire as SELECT FROM my_locks FOR UPDATE, release simply as COMMIT, and let the processes contend for the row lock. If you need more flexibility (e.g. blocking/non-blocking calls, transaction/session/custom scope), advisory locks should prove useful.
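
With advisory locks, acquire and release could look something like this (a sketch: mapping the text name onto the bigint key that the advisory-lock functions expect via hashtext() is my assumption, and a hash collision would make two names share one lock):

def acquire(conn, name):
    # Blocks until granted; the server queues waiters, so no polling is needed.
    cur = conn.cursor()
    cur.execute("SELECT pg_advisory_lock(hashtext(%s))", (name,))

def try_acquire(conn, name):
    # Non-blocking variant: returns True if the lock was granted.
    cur = conn.cursor()
    cur.execute("SELECT pg_try_advisory_lock(hashtext(%s))", (name,))
    return cur.fetchone()[0]

def release(conn, name):
    cur = conn.cursor()
    cur.execute("SELECT pg_advisory_unlock(hashtext(%s))", (name,))

These locks are session-scoped, so they survive COMMIT and disappear if the connection dies; pg_advisory_xact_lock gives transaction scope instead.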

Nick Barnes answered Sep 17 '22