Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to debug ShareLock in Postgres

Tags:

postgresql

I am seeing quite a few occurrences of the following in my Postgres server log:

LOG:  process x still waiting for ShareLock on transaction y after 1000.109 ms 
DETAIL:  Process holding the lock: z. Wait queue: x.
CONTEXT:  while inserting index tuple (a,b) in relation "my_test_table"
    SQL function "my_test_function" statement 1 
...
LOG:  process x acquired ShareLock on transaction y after 1013.664 ms
CONTEXT:  while inserting index tuple (a,b) in relation "my_test_table"

I am running Postgres 9.5.3. In addition I am running on Heroku so I don't have access to the fine grained superuser-only debugging tools.

I am wondering how best to debug such an issue given these constraints and the fact each individual lock is relatively transient (generally 1000-2000ms).

Things I have tried:

  • Monitoring pg_locks (and joining to pg_class for context).
  • Investigating pageinspect.
  • Replicating locally both by hand and with pgbench where I do have superuser perms. I have so far been unable to replicate the issue locally (I suspect due to having a much smaller data set but I can't be sure).

It is worth noting that CPU utilisation appears high (load average of >1) when I see these issues so it's possible there is nothing wrong with the above per se and that I'm seeing it as a consequence of insufficient system resources being available. I would still like to understand how best to debug it though so I can understand what exactly is happening.

like image 547
jabclab Avatar asked Sep 19 '17 15:09

jabclab


People also ask

How do I stop Postgres deadlock?

With two concurrent update queries, postgres can end up in a deadlock in the same way that an application could cause postgres to deadlock. The way we can avoid deadlocks in this scenario is to tell postgres to explicitly lock the rows before the update.

Where is deadlock in PostgreSQL?

Transaction 1 locks table T1 while transaction 2 locks table T2. Then transaction 1 tries to lock T2 and waits for transaction 2. If transaction 2 tries to lock T1, transaction 1 and 2 wait for each other and a deadlock happens. When a deadlock is detected, transaction 2 aborts and transaction 1 is successful.

What is PG lock?

The view pg_locks provides access to information about the locks held by open transactions within the database server. See Chapter 13 for more discussion of locking. pg_locks contains one row per active lockable object, requested lock mode, and relevant transaction.

What is select for update Postgres?

The select ... for update acquires a ROW SHARE LOCK on a table. This lock conflicts with the EXCLUSIVE lock needed for an update statement, and prevents any changes that could happen concurrently. All the locks will be released when the transaction ends.


1 Answers

The key thing is that it's a ShareLock on the transaction.

This means that one transaction is waiting for another to commit/rollback before it can proceed. It's only loosely a "lock". What's happening here is that a PostgreSQL transaction takes an ExclusiveLock on its own transaction ID when it starts. Other transactions that want to wait for it to finish can try to acquire a ShareLock on the transaction, which will block until the ExclusiveLock is released on commit/abort. It's basically using the locking mechanism as a convenience to implement inter-transaction completion signalling.

This usually happens when the waiting transaction(s) are trying to INSERT a UNIQUE or PRIMARY KEY value for a row that's recently inserted/modified by the waited-on transaction. The waiting transactions cannot proceed until they know the outcome of the waited-on transaction - whether it committed or rolled back, and if it committed, whether the target row got deleted/inserted/whatever.

That's consistent with what's in your error message. proc "x" is trying to insert into "my_test_table" and has to wait until proc "y" commits xact "z" to find out whether to raise a unique violation or whether it can proceed.

Most likely you have contention in some kind of upsert or queue processing system. This can also happen if you have some function/transaction pattern that tries to insert into a heavily contended table, then does a lot of other time consuming work before it commits.

like image 119
Craig Ringer Avatar answered Oct 16 '22 07:10

Craig Ringer