I am seeing quite a few occurrences of the following in my Postgres server log:
LOG: process x still waiting for ShareLock on transaction y after 1000.109 ms
DETAIL: Process holding the lock: z. Wait queue: x.
CONTEXT: while inserting index tuple (a,b) in relation "my_test_table"
SQL function "my_test_function" statement 1
...
LOG: process x acquired ShareLock on transaction y after 1013.664 ms
CONTEXT: while inserting index tuple (a,b) in relation "my_test_table"
I am running Postgres 9.5.3. In addition I am running on Heroku so I don't have access to the fine grained superuser-only debugging tools.
I am wondering how best to debug such an issue given these constraints and the fact each individual lock is relatively transient (generally 1000-2000ms).
Things I have tried:
pg_locks
(and joining to pg_class
for context).pageinspect
.pgbench
where I do have superuser perms. I have so far been unable to replicate the issue locally (I suspect due to having a much smaller data set but I can't be sure).It is worth noting that CPU utilisation appears high (load average of >1) when I see these issues so it's possible there is nothing wrong with the above per se and that I'm seeing it as a consequence of insufficient system resources being available. I would still like to understand how best to debug it though so I can understand what exactly is happening.
With two concurrent update queries, postgres can end up in a deadlock in the same way that an application could cause postgres to deadlock. The way we can avoid deadlocks in this scenario is to tell postgres to explicitly lock the rows before the update.
Transaction 1 locks table T1 while transaction 2 locks table T2. Then transaction 1 tries to lock T2 and waits for transaction 2. If transaction 2 tries to lock T1, transaction 1 and 2 wait for each other and a deadlock happens. When a deadlock is detected, transaction 2 aborts and transaction 1 is successful.
The view pg_locks provides access to information about the locks held by open transactions within the database server. See Chapter 13 for more discussion of locking. pg_locks contains one row per active lockable object, requested lock mode, and relevant transaction.
The select ... for update acquires a ROW SHARE LOCK on a table. This lock conflicts with the EXCLUSIVE lock needed for an update statement, and prevents any changes that could happen concurrently. All the locks will be released when the transaction ends.
The key thing is that it's a ShareLock on the transaction.
This means that one transaction is waiting for another to commit/rollback before it can proceed. It's only loosely a "lock". What's happening here is that a PostgreSQL transaction takes an ExclusiveLock on its own transaction ID when it starts. Other transactions that want to wait for it to finish can try to acquire a ShareLock on the transaction, which will block until the ExclusiveLock is released on commit/abort. It's basically using the locking mechanism as a convenience to implement inter-transaction completion signalling.
This usually happens when the waiting transaction(s) are trying to INSERT
a UNIQUE
or PRIMARY KEY
value for a row that's recently inserted/modified by the waited-on transaction. The waiting transactions cannot proceed until they know the outcome of the waited-on transaction - whether it committed or rolled back, and if it committed, whether the target row got deleted/inserted/whatever.
That's consistent with what's in your error message. proc "x" is trying to insert into "my_test_table" and has to wait until proc "y" commits xact "z" to find out whether to raise a unique violation or whether it can proceed.
Most likely you have contention in some kind of upsert or queue processing system. This can also happen if you have some function/transaction pattern that tries to insert into a heavily contended table, then does a lot of other time consuming work before it commits.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With