Deadlocks in PostgreSQL when running UPDATE

I'm a little bit confused reading about PostgreSQL deadlocks.

A typical deadlock example is:

-- Transaction 1
UPDATE customer SET ... WHERE id = 1
UPDATE customer SET ... WHERE id = 2

-- Transaction 2
UPDATE customer SET ... WHERE id = 2
UPDATE customer SET ... WHERE id = 1

But what if I change the code as follows:

-- Transaction 1
UPDATE customer SET ... WHERE id IN (1, 2)

-- Transaction 2
UPDATE customer SET ... WHERE id IN (1, 2)

Will there be a possibility of deadlock here?

Essentially my question is: in the second case, does PostgreSQL lock the rows one by one, or does it lock the entire set of rows covered by the WHERE condition?

Thanks in advance!

vyakhir asked Apr 20 '12


1 Answer

In PostgreSQL the rows will be locked as they are updated -- in fact, the way this actually works is that each tuple (version of a row) has a system field called xmin to indicate which transaction made that tuple current (by insert or update) and a system field called xmax to indicate which transaction expired that tuple (by update or delete). When you access data, it checks each tuple to determine whether it is visible to your transaction, by checking your active "snapshot" against these values.
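You can inspect these system columns directly if you want to see them; a quick illustration, assuming the customer table from the question:

-- xmin: the transaction that created this row version
-- xmax: the transaction that expired (or is trying to expire) this version; 0 if none
SELECT xmin, xmax, id FROM customer WHERE id IN (1, 2);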

If you are executing an UPDATE and a tuple which matches your search conditions has an xmin which would make it visible to your snapshot and an xmax of an active transaction, it blocks, waiting for that transaction to complete. If the transaction which first updated the tuple rolls back, your transaction wakes up and processes the row; if the first transaction commits, your transaction wakes up and takes action depending on the current transaction isolation level.
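A sketch of that blocking behaviour with two sessions (assuming a name column on customer, which the question elides):

-- Session 1
BEGIN;
UPDATE customer SET name = 'a' WHERE id = 1;  -- sets xmax on the old tuple; the row is now effectively locked

-- Session 2, while session 1 is still open
UPDATE customer SET name = 'b' WHERE id = 1;  -- blocks here until session 1 commits or rolls back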

Obviously, a deadlock is the result of this happening to rows in a different order in each transaction. There is no row-level lock in RAM which can be obtained for all rows at the same time, but if rows are updated in the same order you can't have the circular locking. Unfortunately, the suggested IN (1, 2) syntax doesn't guarantee that. Different sessions may have different costing factors active, a background "analyze" task may change statistics for the table between the generation of one plan and the other, or one session may be using a seqscan and be affected by the PostgreSQL optimization which causes a new seqscan to join one already in progress and "loop around" to reduce disk I/O.
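You can check which plan a session would use (and therefore in which order it would visit the rows) without actually running the update; a sketch, again assuming a name column:

EXPLAIN UPDATE customer SET name = 'x' WHERE id IN (1, 2);
-- An index scan typically visits ids in index order; a seqscan visits them in
-- physical heap order, which may differ between sessions for the reasons above.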

If you do the updates one at a time in the same order, in application code or using a cursor, then you will have only simple blocking, not deadlocks. In general, though, relational databases are prone to serialization failures, and it is best to access them through a framework which will recognize them based on SQLSTATE and automatically retry the entire transaction from the start. In PostgreSQL a serialization failure will always have a SQLSTATE of 40001 or 40P01.
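One common way to get that consistent order in plain SQL (not something from the original answer, and again assuming a name column) is to take the row locks explicitly, in a deterministic order, before updating:

BEGIN;
-- Acquire the row locks in id order first
SELECT id FROM customer WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
-- The UPDATE now only touches rows this transaction already holds locks on,
-- so it cannot enter a circular wait with a peer doing the same thing
UPDATE customer SET name = 'x' WHERE id IN (1, 2);
COMMIT;

If a serialization failure still occurs, the transaction can be retried from the top after catching SQLSTATE 40001 or 40P01, as described above.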

http://www.postgresql.org/docs/current/interactive/mvcc-intro.html

kgrittn answered Sep 25 '22