How SELECT FOR UPDATE in PostgreSQL work?

Question

I run the below query concurrently in pgAdmin to see how SELECT FOR UPDATE run.

The logic is that there will be a limit for number of users for each team. And if the number reach the limit, the insert will be abort.

The script is as below. pg_sleep is to give me time execute the script 2 before script 1 finish.

DO $$
declare
    team_user_count INT;

BEGIN
    RAISE NOTICE 'Insert start at %', timeofday();
    
    team_user_count := COUNT(user_id) FROM (select user_id from mst_user_team where team_id = 1 FOR UPDATE ) mut;

    if team_user_count >= 30 then
        RAISE NOTICE 'Not accept insert. team_user_count=%', team_user_count;
    else
        RAISE NOTICE 'Accept insert. team_user_count=%', team_user_count;    
        PERFORM pg_sleep(10);    
        -- insert script here
    end if;

    RAISE NOTICE 'Insert done at %', timeofday();

END$$

At the start, the number of user will be 29. 2 scripts run concurrently and both successfully insert the data. The notice both indicated that the current number of users is 29. Why is this happen?

My thought is that when script 2 will wait until script 1 end, then execute the select for update query to get the number 30. But, apparently, it already get the data, and just wait until script 1 finish to show the result.

Is my understand correct?

Laurenz Albe · Accepted Answer

Like all SQL statements, SELECT ... FOR UPDATE acquires a snapshot when it starts, that is, it determines which transactions are visible and which aren't. Any rows that you INSERT later on aren't visible to the statement, even if they were inserted before the SELECT completed.

There is only one exception where a statement can see data that were added after its snapshot was taken: If a statement is blocked by a lock, it waits for the lock to be relinquished, and then it fetches the latest committed version of the row and checks again if that matches the query condition. This applies only to rows that were updated, but not rows that were inserted, which leads to interesting anomalies.

Your case is a variant of that: one of the concurrent transactions (let's call it transaction 1) grabs the lock on the rows, while the other (transaction 2) is blocked. Transaction 1 then inserts rows and commits. Transaction 2 fetches the latest versions of the locked rows and sees that they didn't change, but it does not see the new rows that were inserted by transaction 1. So the test is passed, and transaction 2 also inserts some rows, violating the desired constraint.

There is no way to guarantee a constraint like yours, unless you either use the SERIALIZABLE transaction isolation level or serialize the transactions (for example with an advisory lock).

How SELECT FOR UPDATE in PostgreSQL work?

Tags:

sql

postgresql

N.D.H.Vu

1 Answers

Laurenz Albe

Recent Activity

Donate For Us

How SELECT FOR UPDATE in PostgreSQL work?

Tags:

sql

postgresql

N.D.H.Vu

1 Answers

Laurenz Albe

Related questions

Recent Activity

Donate For Us