Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Wrong PostgreSQL query results with explicit locks and concurrent transaction

While writing some SQL queries for PostgreSQL, I discovered some unusual behaviour, which I find slightly disturbing.

Say we have the following table "test":

+----+-------+---------------------+
| id | value |     created_at      |
+----+-------+---------------------+
|  1 | A     | 2014-01-01 00:00:00 |
|  2 | A     | 2014-01-02 00:00:00 |
|  3 | B     | 2014-01-03 00:00:00 |
|  4 | B     | 2014-01-04 00:00:00 |
|  5 | A     | 2014-01-05 00:00:00 |
|  6 | B     | 2014-01-06 00:00:00 |
|  7 | A     | 2014-01-07 00:00:00 |
|  8 | B     | 2014-01-08 00:00:00 |
+----+-------+---------------------+

There are two transactions, A and B, running in parallel.

A: begin;           /* Begin transaction A */
B: begin;           /* Begin transaction B */
A: select * from test where id = 1 for update; /* Lock one row */
B: select * from test where value = 'B' order by created_at limit 3 for update; /* This query returns immediately since it does not need to return row with id=1 */
B: select * from test where value = 'A' order by created_at limit 3 for update; /* This query blocks because row id=1 is locked by transaction A */
A: update test set created_at = '2014-01-09 00:00:00' where id = 1; /* Modify the locked row */
A: commit;

As soon as transaction A commits and releases the row with id=1, the blocked query of transaction B returns the following result:

+----+-------+---------------------+
| id | value |     created_at      |
+----+-------+---------------------+
|  1 | A     | 2014-01-09 00:00:00 |
|  2 | A     | 2014-01-02 00:00:00 |
|  5 | A     | 2014-01-05 00:00:00 |
+----+-------+---------------------+

These rows are most certainly not ordered by "created_at" and row with id=1 shouldn't even be among the returned rows. The fact that transactions A and B were running concurrently, has resulted in wrong results in transaction B, which would not have happened if the transactions had been executed one after the other. This seems like a violation of transaction isolation.

Is this a bug?

If this is not a bug and these results are expected, what does this mean in terms of the reliability of results returned by the DB? If I had a highly concurrent environment and the subsequent code relied on the rows actually being ordered by date, there would be errors.

If, however, we run the same sequence of instructions as above, but replace the update statement with the following:

update test set value = 'B', created_at = '2014-01-09 00:00:00' where id = 1;

... then the blocked query returns the correct result:

+----+-------+---------------------+
| id | value |     created_at      |
+----+-------+---------------------+
|  2 | A     | 2014-01-02 00:00:00 |
|  5 | A     | 2014-01-05 00:00:00 |
|  7 | A     | 2014-01-07 00:00:00 |
+----+-------+---------------------+

In this case, does the blocked query get executed twice since its initial result gets invalidated?

I'm most interested in PostgreSQL, but I would also like to know if this is the case with other RDBMS that support row-level locking, such as Oracle, SQL Server and MySQL.

like image 855
Jaan Avatar asked Oct 06 '14 16:10

Jaan


1 Answers

There are a couple of things going on here. First, this is documented behavior. Second, you don't see the whole story, because you didn't try to update anything in session "B".

This seems like a violation of transaction isolation.

Depends on what isolation level you're running at. PostgreSQL's default transaction isolation level is READ COMMITTED.

This is documented behavior in PostgreSQL.

It is possible for a SELECT command running at the READ COMMITTED transaction isolation level and using ORDER BY and a locking clause to return rows out of order. This is because ORDER BY is applied first. The command sorts the result, but might then block trying to obtain a lock on one or more of the rows. Once the SELECT unblocks, some of the ordering column values might have been modified, leading to those rows appearing to be out of order (though they are in order in terms of the original column values).

One workaround (also documented, same link) is to move the FOR UPDATE into a subquery, but this requires a table lock.

To see what PostgreSQL really does in this situation, run an update in session "B".

create table test (
  id integer primary key,
  value char(1) not null,
  created_at timestamp not null
);
insert into test values
(1, 'A', '2014-01-01 00:00:00'),
(2, 'A', '2014-01-02 00:00:00'),
(3, 'B', '2014-01-03 00:00:00'),
(4, 'B', '2014-01-04 00:00:00'),
(5, 'A', '2014-01-05 00:00:00'),
(6, 'B', '2014-01-06 00:00:00'),
(7, 'A', '2014-01-07 00:00:00'),
(8, 'B', '2014-01-08 00:00:00');
A: begin;           /* Begin transaction A */
B: begin;           /* Begin transaction B */
A: select * from test where id = 1 for update; /* Lock one row */
B: select * from test where value = 'B' order by created_at limit 3 for update; /* This query returns immediately since it does not need to return row with id=1 */
B: select * from test where value = 'A' order by created_at limit 3 for update; /* This query blocks because row id=1 is locked by transaction A */
A: update test set created_at = '2014-01-09 00:00:00' where id = 1; /* Modify the locked row */
A: commit;
B: update test set value = 'C' where id in (select id from test where value = 'A' order by created_at limit 3); /* Updates 3 rows */
B: commit;

Now, look at the table.

scratch=# select * from test order by id;
 id | value |     created_at      
----+-------+---------------------
  1 | A     | 2014-01-09 00:00:00
  2 | C     | 2014-01-02 00:00:00
  3 | B     | 2014-01-03 00:00:00
  4 | B     | 2014-01-04 00:00:00
  5 | C     | 2014-01-05 00:00:00
  6 | B     | 2014-01-06 00:00:00
  7 | C     | 2014-01-07 00:00:00
  8 | B     | 2014-01-08 00:00:00

Session "A" succeeded in updating the row having id 1 to '2014-01-09'. Session "B" succeeded in updating the three remaining rows whose value was 'A'. The update statement obtained locks on id numbers 2, 5, and 7; we know that because those were the rows actually updated. The earlier select statement locked different rows--rows 1, 2, and 5.

You can block session B's update if you start a third terminal session, and lock row 7 for update.

like image 56
Mike Sherrill 'Cat Recall' Avatar answered Oct 11 '22 07:10

Mike Sherrill 'Cat Recall'