Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order of locking in postgres SELECT FOR UPDATE

Tags:

postgresql

Assuming both table a and b has a single row, the query

 SELECT * FROM a, b FOR UPDATE

should get two row-level locks (one on a, one on b). Is there any defined ordering in which the locks are acquired? And is there any way to require the lock from the table b to get acquired lock before the one from a (to avoid deadlocking with other transactions)?

like image 779
Heptic Avatar asked Mar 13 '15 20:03

Heptic


1 Answers

Is there any defined ordering in which the locks are acquired?

Not that I know of, for SELECT * anyway. Since no lock ordering is documented for this case, you cannot rely on it even if in practice it exists. It might change in a future version.

And is there any way to require the lock from the table b to get acquired lock before the one from a (to avoid deadlocking with other transactions)?

If you must use SELECT * then no. But if you can control the SELECT-list, yes. It looks like row locks get acquired in the order the relevant tuple fields appear in the SELECT list, so:

SELECT a.x, b.x FROM b, a FOR UPDATE;

will acquire a lock on the row from a then on the row from b. At the moment, anyway; I don't think the standard specifies this and don't see any reference to it in the documentation, so this could change later.

Personally ... I'd use a DO block or separate queries. It might be possible to do it with some subqueries or a CTE, but you'd have to create some form of artificial dependency between them to ensure ordering. Fragile and not worth it.


Lets look at what actually happens:

regress=> EXPLAIN (VERBOSE) SELECT * FROM a, b FOR UPDATE;
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 LockRows  (cost=0.00..129674.00 rows=5760000 width=20)
   Output: a.x, b.x, a.ctid, b.ctid
   ->  Nested Loop  (cost=0.00..72074.00 rows=5760000 width=20)
         Output: a.x, b.x, a.ctid, b.ctid
         ->  Seq Scan on public.a  (cost=0.00..34.00 rows=2400 width=10)
               Output: a.x, a.ctid
         ->  Materialize  (cost=0.00..46.00 rows=2400 width=10)
               Output: b.x, b.ctid
               ->  Seq Scan on public.b  (cost=0.00..34.00 rows=2400 width=10)
                     Output: b.x, b.ctid
(10 rows)

The query gets executed, then the results get fed into a LockRows node. What does LockRows do? For that, it's time to go source code diving.

ExecLockRows in src/backend/executor/nodeLockRows.c is the relevant code. There's a lot there, but the gist of it is that it iterates in order over a list of RowMarks and acquires each lock in order. The list is set up by ExecInitLockRows, where it copies and filters the list prepared during planning and stored in the LockRows node.

I don't have time to trace back in the planner to find the ordering of LockRows creation, but IIRC it's basically just the parse order (for SELECT *) or the order in which the fields appear in the SELECT list (if you didn't use *). I recommend not relying on it.

like image 53
Craig Ringer Avatar answered Oct 25 '22 18:10

Craig Ringer