I'm having trouble understanding how locks interact with transactions in Postgres. When I run this (long) query, I am surprised by the high degree of locking that occurs: <pre class="prettyprint"><code>BEGIN; TRUNCATE foo; \COPY foo FROM 'backup.txt'; COMMIT; </code></pre> The documentation for <code>\COPY</code> doesn't mention what level of lock it requires, but this post indicates that it only gets a RowExclusiveLock. But when I run this query during the <code>\COPY</code>: <pre class="prettyprint"><code>SELECT mode, granted FROM pg_locks WHERE relation='foo'::regclass::oid; </code></pre> I get this: <pre class="prettyprint"><code>mode granted RowExclusiveLock true ShareLock true AccessExclusiveLock true </code></pre> Where the heck is that AccessExclusiveLock coming from? I assume it's coming from the <code>TRUNCATE</code>, which requires an AccessExclusiveLock. But the <code>TRUNCATE</code> finishes quickly, so I'd expect the lock to release quickly as well. This leaves me with a few questions. When a lock is acquired by a command within a transaction, is that lock released at the end of the command (before the end of the transaction)? If so, why do I observe the above behavior? If not, why not? In fact, since transactions don't touch the table until the <code>COMMIT</code>, why does a <code>TRUNCATE</code> in a transaction need to block the table at all? I don't see any discussion of this in the documentation for transactions in PG.

There are a couple of misconceptions to be cleared up here. First, a transaction does touch the table before it is committed. The comment you are quoting says that a <code>ROLLBACK</code> (and a <code>COMMIT</code> as well) don't touch the table, which is something different. They record the transaction state in the commit log (in <code>pg_clog</code>), and <code>COMMIT</code> flushes the transaction log to disk (one notable exception to this is <code>TRUNCATE</code>, which is relevant for your question: the old table is kept around until the end of the transaction and gets deleted during <code>COMMIT</code>). If all changes were held back until <code>COMMIT</code> and no locks would be taken, <code>COMMIT</code> would be quite expensive and would routinely fail because of concurrent modifications. The transaction would have to remember the state of the database as it was before and check if the changes still apply. This way of handling concurrency is called optimistic concurreny control, and while it is a decent strategy for an application, it won't work well for a relational database, where <code>COMMIT</code> should be efficient and should not fail (unless there are major problems with the infrastructure). So what relational databases use is pessimistic concurrency control or locking, i.e. they lock a database object before they access it to prevent concurrent activity from getting in their way. Second, relational databases use two-phase locking, in which locks (at least the user-visible, so-called heavyweight locks) are always held until the end of the transaction. This is necessary (but not sufficient) to keep transactions in a logical order (serializable) and consistent. What if you release a lock, and somebody else removes the row that your inserted, but uncommitted row refers to via a foreign key constraint? Answer to the question The upshot of all this is that your table will keep the <code>ACCESS EXCLUSIVE</code> lock from <code>TRUNCATE</code> until the end of the transaction. Isn't it evident why that is necessary? If other transactions were allowed to even read the table after the (as of yet uncommitted) <code>TRUNCATE</code>, they would find it empty, since <code>TRUNCATE</code> really empties the table and does not adhere to MVCC semantics. Such a dirty read (of uncommitted data that might yet be rolled back) cannot be allowed. If you really need read access to the table during the refill, you could use <code>DELETE</code> instead of <code>TRUNCATE</code>. The downside is that this is a much more expensive operation that will leave the table with a lot of “dead tuples” that have to be removed by autovacuum, resulting in a lot of empty space (table bloat). But if you are willing to live with a table and indexes that are bloated such that table and index scans will take at least twice as long, it is an option.

Postgres locks within a transaction

Tags:

postgresql

locking

transactions

postgresql-9.4

I'm having trouble understanding how locks interact with transactions in Postgres.

When I run this (long) query, I am surprised by the high degree of locking that occurs:

BEGIN;
TRUNCATE foo;
\COPY foo FROM 'backup.txt';
COMMIT;

The documentation for \COPY doesn't mention what level of lock it requires, but this post indicates that it only gets a RowExclusiveLock. But when I run this query during the \COPY:

SELECT mode, granted FROM pg_locks
WHERE relation='foo'::regclass::oid;

I get this:

mode    granted
RowExclusiveLock    true
ShareLock   true
AccessExclusiveLock true

Where the heck is that AccessExclusiveLock coming from? I assume it's coming from the TRUNCATE, which requires an AccessExclusiveLock. But the TRUNCATE finishes quickly, so I'd expect the lock to release quickly as well. This leaves me with a few questions.

When a lock is acquired by a command within a transaction, is that lock released at the end of the command (before the end of the transaction)? If so, why do I observe the above behavior? If not, why not? In fact, since transactions don't touch the table until the COMMIT, why does a TRUNCATE in a transaction need to block the table at all?

I don't see any discussion of this in the documentation for transactions in PG.

402

asked Nov 22 '16 21:11

Simon Lepkin

1 Answers

There are a couple of misconceptions to be cleared up here.

First, a transaction does touch the table before it is committed. The comment you are quoting says that a ROLLBACK (and a COMMIT as well) don't touch the table, which is something different. They record the transaction state in the commit log (in pg_clog), and COMMIT flushes the transaction log to disk (one notable exception to this is TRUNCATE, which is relevant for your question: the old table is kept around until the end of the transaction and gets deleted during COMMIT).

If all changes were held back until COMMIT and no locks would be taken, COMMIT would be quite expensive and would routinely fail because of concurrent modifications. The transaction would have to remember the state of the database as it was before and check if the changes still apply. This way of handling concurrency is called optimistic concurreny control, and while it is a decent strategy for an application, it won't work well for a relational database, where COMMIT should be efficient and should not fail (unless there are major problems with the infrastructure).

So what relational databases use is pessimistic concurrency control or locking, i.e. they lock a database object before they access it to prevent concurrent activity from getting in their way.

Second, relational databases use two-phase locking, in which locks (at least the user-visible, so-called heavyweight locks) are always held until the end of the transaction. This is necessary (but not sufficient) to keep transactions in a logical order (serializable) and consistent. What if you release a lock, and somebody else removes the row that your inserted, but uncommitted row refers to via a foreign key constraint?

Answer to the question

The upshot of all this is that your table will keep the ACCESS EXCLUSIVE lock from TRUNCATE until the end of the transaction. Isn't it evident why that is necessary? If other transactions were allowed to even read the table after the (as of yet uncommitted) TRUNCATE, they would find it empty, since TRUNCATE really empties the table and does not adhere to MVCC semantics. Such a dirty read (of uncommitted data that might yet be rolled back) cannot be allowed.

If you really need read access to the table during the refill, you could use DELETE instead of TRUNCATE. The downside is that this is a much more expensive operation that will leave the table with a lot of “dead tuples” that have to be removed by autovacuum, resulting in a lot of empty space (table bloat). But if you are willing to live with a table and indexes that are bloated such that table and index scans will take at least twice as long, it is an option.

answered Oct 02 '22 04:10

Laurenz Albe

Related questions
                            
                                add schema to path in postgresql
                            
                                Converting an int to an IP address
                            
                                rake aborted! no such file to load -- pg (Creating postgre database in rails )
                            
                                What are alternatives to JDBC driver for access PostgreSQL database
                            
                                How do I export a postgresql database schema to an XML format?
                            
                                Is there a library for sanitizing query parameters for PostgreSQL or SQL in general, for FreePascal and Delphi?
                            
                                Insert multiple ENUM values in PostgreSQL
                            
                                Replace unicode characters in PostgreSQL
                            
                                md5() works with literal, but not with column data
                            
                                How Do I Change the Dialect in Dapper Extensions?
                            
                                Storing images in bytea fields in a PostgreSQL database
                            
                                Is there a way to disable updates/deletes but still allow triggers to perform them?
                            
                                PostgreSQL drop table command is not reducing DB size
                            
                                How do I access the database from my browser?
                            
                                Updating primary keys in POSTGRESQL
                            
                                How to connect to localhost with postgres_fdw?
                            
                                Postgres restore from .dump file: Invalid byte sequence for encoding "UTF8"
                            
                                Postgres won't connect to server in Ruby on Rails app using c9.io
                            
                                PostgreSQL: How to multiply two columns and display result in same query?
                            
                                hibernate: how to select all rows in a table

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With