Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimistic locking in MySQL

I can't find any details on optimistic locking in MySQL. I read that starting a transaction keep updates on two entities synced, however - it doesn't stop two users updating the data at the same time causing a conflict.

Apparently optimistic locking will solve this issue? How is this applied in MySQL. Is there SQL syntax / keyword for this? Or does MySQL have default behavior?

Thanks guys.

like image 217
Green Acorn Avatar asked Jul 02 '13 16:07

Green Acorn


People also ask

What is optimistic locking in MySQL?

There are two models for locking data in a database: Optimistic locking , where a record is locked only when changes are committed to the database. Pessimistic locking , where a record is locked while it is edited.

What is optimistic locking in database?

Optimistic locking is a technique for SQL database applications that does not hold row locks between selecting and updating or deleting a row. The application is written to optimistically assume that unlocked rows are unlikely to change before the update or delete operation.

Where is optimistic locking used?

In order to use optimistic locking, we need to have an entity including a property with @Version annotation. While using it, each transaction that reads data holds the value of the version property. Before the transaction wants to make an update, it checks the version property again.

Does MySQL support pessimistic locking?

MySQL supports pessimistic locking by default. Now TiDB also supports pessimistic locking, so MySQL users don't need to modify much application code to get started with TiDB.


1 Answers

The point is that Optimistic Locking is not a database feature, not for MySQL nor for others: optimistic locking is a practice that is applied using the DB with standard instructions.

Let's have a very simple example and say that you want to do this in a code that multiple users/clients can run concurrently:

  1. SELECT data from a row having one ID field (iD) and two data fields (val1, val2)
  2. optionally do your calculations with data
  3. UPDATE data of that row

The NO LOCKING way to is:

NOTE: all code {between curly brackets} is intended to be in the app code and not (necessarily) in the SQL side

- SELECT iD, val1, val2        FROM theTable        WHERE iD = @theId;  - {code that calculates new values}  - UPDATE theTable        SET val1 = @newVal1,            val2 = @newVal2        WHERE iD = @theId;  - {go on with your other code} 

The OPTIMISTIC LOCKING way is:

- SELECT iD, val1, val2        FROM theTable        WHERE iD = @theId;  - {code that calculates new values}  - UPDATE theTable        SET val1 = @newVal1,            val2 = @newVal2        WHERE iD = @theId            AND val1 = @oldVal1            AND val2 = @oldVal2;  - {if AffectedRows == 1 }  -     {go on with your other code}  - {else}  -     {decide what to do since it has gone bad... in your code}  - {endif} 

Note that the key point is in the structure of the UPDATE instruction and the subsequent number of affected rows check. It is these two things together that let your code realize that someone has already modified the data in between when you have executed the SELECT and UPDATE. Notice that all has been done without transactions! This has been possible (absence of transactions) only because this is a very simple example but this tells also that the key point for Optimistic locking is not in transactions themselves.

What about TRANSACTIONS then?

 - SELECT iD, val1, val2        FROM theTable        WHERE iD = @theId;  - {code that calculates new values}  - BEGIN TRANSACTION;  - UPDATE anotherTable        SET col1 = @newCol1,            col2 = @newCol2        WHERE iD = @theId;  - UPDATE theTable        SET val1 = @newVal1,            val2 = @newVal2        WHERE iD = @theId            AND val1 = @oldVal1            AND val2 = @oldVal2;  - {if AffectedRows == 1 }  -     COMMIT TRANSACTION;  -     {go on with your other code}  - {else}  -     ROLLBACK TRANSACTION;  -     {decide what to do since it has gone bad... in your code}  - {endif} 

This last example shows that if you check for collisions at some point and discover a collision has happened when you have already modified other tables/rows.. ..then with transactions you are able to rollback ALL the changes that you've done since the beginning. Obviously it is up to you (that knows what your application is doing) to decide how large the amount of operations to rollback is for each possible collision and based on this decide where to put the transactions boundaries and where to check for collisions with the special UPDATE + AffectedRows check.

In this case with transactions we have separated the moment when we perform the UPDATE from the moment when it is committed. So what happens when an "other process" performs an update in this time frame? To know what happens exactly requires delving into the details of isolation levels (and how they are managed on each engine). As an example in the case of Microsoft SQL Server with READ_COMMITTED the updated rows are locked until the COMMIT, so "other process" can't do nothing (is kept waiting) on that rows, neither a SELECT (in fact it can only READ_COMMITTED). So since the "other process" activity is deferred it's UPDATE will fail.

The VERSIONING OPTIMISTIC LOCKING option:

 - SELECT iD, val1, val2, version        FROM theTable        WHERE iD = @theId;  - {code that calculates new values}  - UPDATE theTable        SET val1 = @newVal1,            val2 = @newVal2,            version = version + 1        WHERE iD = @theId            AND version = @oldversion;  - {if AffectedRows == 1 }  -     {go on with your other code}  - {else}  -     {decide what to do since it has gone bad... in your code}  - {endif} 

Here it is shown that instead of checking if the value is still the same for all the fields we can use a dedicated field (that is modified each time we do an UPDATE) to see if anyone was quicker than us and changed the row between our SELECT and UPDATE. Here the absence of transactions is due to the simplicity as in the first example and is not related with the version column use. Again this column use is up to the implementation in the application code and not a database engine feature.

More than this there are other points which I think would make this answer too long (is already much too long) so I only mention them by now with some references:

  • transaction isolation level (here for MySQL) about transaction effect on SELECTs.
  • for the INSERT on tables with primary keys not autogenerated (or unique constraints) it will automatically fail with no need of particular checking if two processes try to insert the same values where it must be unique.
  • if you have no id column (primary key or unique constraints) also a single SELECT + UPDATE require transactions because you could have the surprise that after modifications made by others there are more rows than expected matching the criteria of the UPDATE's WHERE clause.

How to check in practice and get confident

Since the isolation level value and implementation may be different the best advice (as usual in this site) is to perform a test on the used platform / environment.

It may seem difficult but in reality it can be done quite easily from any DB development environment using two separate windows and starting on each one a transaction then executing the commands one by one.

At some point you will see that the the command execution continues indefinitely. Then when on the other window it is called COMMIT or ROLLBACK it completes the execution.

Here are some very basic commands ready to be tested as just described.

Use these for creating the table and one useful row:

CREATE TABLE theTable(     iD int NOT NULL,     val1 int NOT NULL,     val2 int NOT NULL ); INSERT INTO theTable (iD, val1, val2) VALUES (1, 2 ,3); 

Then the following on two different windows and step by step:

BEGIN TRAN  SELECT val1, val2 FROM theTable WHERE iD = 1;  UPDATE theTable   SET val1=11   WHERE iD = 1 AND val1 = 2 AND val2 = 3;  COMMIT TRAN 

Then change the order of commands and order of execution in any order you may think.

like image 162
Diego Mazzaro Avatar answered Sep 24 '22 23:09

Diego Mazzaro