Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL Idle In Transaction Diagnosis and Reading pg_locks

Tags:

postgresql

Setup: multiple webservers, running mod_wsgi, Apache, and pgbouncer which connects to the shared DB running Postgres 8.3.6. Application is running Django.

What we're seeing: 'idle in transaction' queries on the DB that hang out for a long time. In order to see them, I'll run something like this:

SELECT query_start, procpid, client_addr, current_query FROM pg_stat_activity
WHERE query_start < NOW() - interval '5 minutes';

Most results of course are just IDLE connections that pgbouncer is keeping open for use, but sometimes there will be these old 'IDLE in transaction' queries. I understand that this means that there is a query transaction which is waiting for something, or something which had a BEGIN but hasn't reached a COMMIT or ROLLBACK.

My next step was to try to use pg_locks to determine what the process is waiting on:

select pg_class.relname, pg_locks.transactionid, pg_locks.mode,
       pg_locks.granted as "g", pg_stat_activity.current_query,
       pg_stat_activity.query_start,
       age(now(),pg_stat_activity.query_start) as "age",
       pg_stat_activity.procpid 
from pg_stat_activity,pg_locks
left outer join pg_class on (pg_locks.relation = pg_class.oid)  
where pg_locks.pid=pg_stat_activity.procpid
and pg_stat_activity.procpid = <AN IDLE TRANSACTION PROCESS>
order by query_start;

A lot of times, the result I get looks like so:

 relname | transactionid |      mode       | g |     current_query     |         query_start          |       age       |  client_addr   | procpid 
---------+---------------+-----------------+---+-----------------------+------------------------------+-----------------+----------------+---------
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | ExclusiveLock   | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
         |               | AccessShareLock | t | <IDLE> in transaction | 2010-07-22 15:33:11.48136-04 | 00:23:35.029045 | 192.168.100.99 |    1991
(10 rows)

I'm not sure how to read this (I guess it stems from not really understanding pg_locks). There's no relname, so is it saying that it's waiting on nothing? I thought that if granted was 'true', it had the lock. Since all these results are granted, is pg_locks showing me the locks that it has rather than what it's waiting for?

Right now I'm 'fixing' this by restarting Apache, which seems to shake the transactions loose, but obviously that's not a real solution. I'm looking for Postgres to give me a place on where to look to figure this out, especially since Django is supposed to manage its connections and transactions automatically.

like image 430
KRH Avatar asked Jul 22 '10 20:07

KRH


1 Answers

For Django specifically, this entry details why you see this issue:

Threaded Django task...

I say "specifically" here because the real problem is web frameworks/drivers/ORMs working all the time in a transaction-based mode (and sometimes calling rollback after every freakin' SELECT query) when they should really be running in an Auto-Commit mode and handling the need for transactions only on an as-needed basis. The Apache::Sessions PostgreSQL persistence module was a disaster (at least a few years ago) as it only closed transactions when it was garbage collected. Yikes!

like image 98
Matthew Wood Avatar answered Oct 07 '22 19:10

Matthew Wood