Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

postgres update after select

I want to do the following in one go:

SELECT * FROM jobs WHERE status='PENDING';
UPDATE jobs SET status='RUNNING' WHERE status='PENDING';

So get all pending jobs, then set them as 'RUNNING' immediately after.

The reason I don't want to do it one after the other in two statements is that jobs could be added to the jobs table as 'PENDING' after the SELECT but before the UPDATE so I'd end up setting jobs as RUNNING even though I haven't grabbed it while it was in it's PENDING state.

Is there anyway to do this in one? So I want the result from SELECT and the UPDATE to happen on the fly.

Thanks.

like image 597
tbh1 Avatar asked Jul 20 '11 14:07

tbh1


People also ask

How UPDATE works in Postgres?

PostgreSQL implements multiversioning by keeping the old version of the table row in the table – an UPDATE adds a new row version (“tuple”) of the row and marks the old version as invalid. In many respects, an UPDATE in PostgreSQL is not much different from a DELETE followed by an INSERT .

How do I change the value of a column in Postgres?

Syntax. UPDATE table_name SET column1 = value1, column2 = value2...., columnN = valueN WHERE [condition];

What is Upsert in PostgreSQL?

The UPSERT statement is a DBMS feature that allows a DML statement's author to either insert a row or if the row already exists, UPDATE that existing row instead. That is why the action is known as UPSERT (simply a mix of Update and Insert).


3 Answers

Why not use the RETURNING clause and process both things in one single statement:

UPDATE jobs 
    SET status='RUNNING' 
WHERE status='PENDING'
RETURNING *

That way you will get all rows that were changed by the UPDATE with a single atomic operation.

like image 75
a_horse_with_no_name Avatar answered Oct 02 '22 18:10

a_horse_with_no_name


In general, you should do it with one UPDATE statement. The UPDATE will normally not be affected by rows that could have changed while the UPDATE statement is running, however, it's good to read up on transaction isolation levels here.

Assuming you're using default setting of Read Committed, here is what it says:

Read Committed is the default isolation level in PostgreSQL. When a transaction runs on this isolation level, a SELECT query sees only data committed before the query began;

And in regards to UPDATE:

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row may have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation, starting from the updated version of the row. (In the case of SELECT FOR UPDATE and SELECT FOR SHARE, that means it is the updated version of the row that is locked and returned to the client.)

So in your scenario, one UPDATE should be fine.

Keep in mind too, that there is what's known as a SELECT FOR UPDATE statement, which will lock the rows you select. You can read about that here. A scenario where you would need to use this feature would be in a reservation system. Consider this example:

  1. Execute SELECT to find out if room XYZ is available for a reservation on date X.
  2. The room is available. Execute UPDATE query to book the room.

Do you see the potential problem here? If between steps 1 and 2 the room gets booked by another transaction, then when we reach step 2 we are operating on an assumption which is no longer valid, namely, that the room is available.

However, if in step 1 we use the SELECT FOR UPDATE statement instead, we ensure that no other transaction can lock that row, so when we go to UPDATE the row, we know it's safe to do so.

But again, in your scenario, this SELECT FOR UPDATE isn't needed, because you are doing everything in one statement and aren't checking anything ahead of time.

like image 27
dcp Avatar answered Oct 02 '22 18:10

dcp


begin;
select * 
from jobs 
where status='pending'
for update
;
update jobs 
set status='running' 
where status='pending';
commit;
like image 43
Clodoaldo Neto Avatar answered Oct 02 '22 19:10

Clodoaldo Neto