I'm hitting a deadlock and I'm trying to figure out the reasoning behind it.
The question can be reduced to this:
table:
create table testdl (id int auto_increment, c int, primary key (id), key idx_c (c));
The isolation level is REPEATABLE READ.
(Tx1): begin; delete from testdl where c = 1000; -- nothing is deleted because the table is empty
(Tx2): begin; insert into testdl (c) values (?);
Whatever the value in Tx2 is, it hangs. So it basically means that Tx1 holds a gap lock over the whole range (-∞, +∞) when delete from testdl where c = 1000 fails to find a match, right?
So my question is: is this by design? What's the point of this if it is?
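For reference, on MySQL 8.0+ the lock Tx1 ends up holding can be inspected from a third session via performance_schema.data_locks; the expected values in the comments below are my assumption of typical output, not captured output:

-- run from a third session while Tx1 is still open (MySQL 8.0+)
select engine_transaction_id, index_name, lock_type, lock_mode, lock_data
from performance_schema.data_locks
where object_name = 'testdl';

-- expected: besides a table-level IX lock, Tx1 holds a record lock on the
-- supremum pseudo-record of idx_c, i.e. the single gap spanning the whole
-- empty index:
--   index_name = 'idx_c', lock_type = 'RECORD',
--   lock_mode = 'X', lock_data = 'supremum pseudo-record'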
Update:
Say we already have a record in testdl:
+----+------+
| id | c |
+----+------+
| 1 | 1000 |
+----+------+
Case 1:
(Tx1): select * from testdl where c = 500 for update; -- c = 500 not exists
(Tx2): insert into testdl (c) values (?);
In this case, any value >= 1000 can be inserted, so Tx1 locks the gap (-∞, 1000)
Again, is locking (-∞, 1000) necessary? What's the reasoning behind this?
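(Again assuming MySQL 8.0+, performance_schema.data_locks for this case should show a gap-only lock; the values below are what I'd expect, not verbatim output:)

select index_name, lock_type, lock_mode, lock_data
from performance_schema.data_locks
where object_name = 'testdl' and lock_type = 'RECORD';

-- expected: a gap lock (not a next-key lock) on the idx_c entry (c = 1000, id = 1),
-- which locks only the gap before that entry, i.e. (-∞, 1000):
--   index_name = 'idx_c', lock_mode = 'X,GAP', lock_data = '1000, 1'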
This is similar to something I was curious about myself recently, so let me try to explain.
Whatever the value in Tx2 is, it hangs. So it basically means that Tx1 holds a gap lock over the whole range (-∞, +∞) when delete from testdl where c = 1000 fails to find a match, right?
So my question is: is this by design? What's the point of this if it is?
This is by design: the main point of gap locks is to prevent records from being inserted into these gaps, in order to avoid phantom rows.
So, imagine you have your empty table, and inside a transaction you do delete from testdl where c = 1000;. Now, no matter how many such rows existed before, you expect that after this query there are no such rows in your table, right? So, if you then run select * from testdl where c = 1000 for update; in the same transaction, you expect an empty result.
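Here is a minimal sketch of that guarantee in action (two sessions; the interleaving in the comments is the intended schedule, not captured output):

-- session 1 (Tx1):
begin;
delete from testdl where c = 1000;                -- matches nothing, but gap-locks the index

-- session 2 (Tx2):
begin;
insert into testdl (c) values (1000);             -- blocks: the gap locked by Tx1 covers c = 1000

-- session 1 (Tx1), meanwhile:
select * from testdl where c = 1000 for update;   -- still empty, exactly as the transaction expects
commit;                                           -- releases the gap lock; Tx2's insert can now proceed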
But in order to make sure no new rows with c = 1000 get inserted into the table, we need to lock the gaps where such records could be inserted. And in an empty table there is only one gap: the gap between the infimum and supremum pseudo-records (as Michael pointed out).
In this case, any value >= 1000 can be inserted, so Tx1 locks the gap (-∞, 1000)
Again, is locking (-∞, 1000) necessary? What's the reasoning behind this?
I believe the above explanation should also answer the questions about your second case, when there is already one record in the table. But I'll try to explain it anyway.
In your first transaction you do select * from testdl where c = 500 for update;, and now we need to make sure no new records with c = 500 appear if we decide to run the same query again inside this transaction. So we need to lock all the necessary gaps. Which gaps do we have? (-∞, 1000) and (1000, +∞). Obviously, new records with c = 500 won't be inserted into the second gap, but they would be inserted into the first one, so we have to lock it.
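Concretely, with only the row (1, 1000) in the table, a quick sketch of what blocks and what doesn't:

-- session 1 (Tx1):
begin;
select * from testdl where c = 500 for update;    -- no match; gap-locks (-∞, 1000) on idx_c

-- session 2 (Tx2):
begin;
insert into testdl (c) values (1500);             -- succeeds: 1500 falls in the unlocked gap (1000, +∞)
insert into testdl (c) values (500);              -- blocks: 500 falls in the locked gap (-∞, 1000)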
Hope this answers it.