I can't find any details on optimistic locking in MySQL. I read that starting a transaction keep updates on two entities synced, however - it doesn't stop two users updating the data at the same time causing a conflict.
Apparently optimistic locking will solve this issue? How is this applied in MySQL. Is there SQL syntax / keyword for this? Or does MySQL have default behavior?
Thanks guys.
There are two models for locking data in a database: Optimistic locking , where a record is locked only when changes are committed to the database. Pessimistic locking , where a record is locked while it is edited.
Optimistic locking is a technique for SQL database applications that does not hold row locks between selecting and updating or deleting a row. The application is written to optimistically assume that unlocked rows are unlikely to change before the update or delete operation.
In order to use optimistic locking, we need to have an entity including a property with @Version annotation. While using it, each transaction that reads data holds the value of the version property. Before the transaction wants to make an update, it checks the version property again.
MySQL supports pessimistic locking by default. Now TiDB also supports pessimistic locking, so MySQL users don't need to modify much application code to get started with TiDB.
The point is that Optimistic Locking is not a database feature, not for MySQL nor for others: optimistic locking is a practice that is applied using the DB with standard instructions.
Let's have a very simple example and say that you want to do this in a code that multiple users/clients can run concurrently:
NOTE: all code {between curly brackets} is intended to be in the app code and not (necessarily) in the SQL side
- SELECT iD, val1, val2 FROM theTable WHERE iD = @theId; - {code that calculates new values} - UPDATE theTable SET val1 = @newVal1, val2 = @newVal2 WHERE iD = @theId; - {go on with your other code}
- SELECT iD, val1, val2 FROM theTable WHERE iD = @theId; - {code that calculates new values} - UPDATE theTable SET val1 = @newVal1, val2 = @newVal2 WHERE iD = @theId AND val1 = @oldVal1 AND val2 = @oldVal2; - {if AffectedRows == 1 } - {go on with your other code} - {else} - {decide what to do since it has gone bad... in your code} - {endif}
Note that the key point is in the structure of the UPDATE instruction and the subsequent number of affected rows check. It is these two things together that let your code realize that someone has already modified the data in between when you have executed the SELECT and UPDATE. Notice that all has been done without transactions! This has been possible (absence of transactions) only because this is a very simple example but this tells also that the key point for Optimistic locking is not in transactions themselves.
- SELECT iD, val1, val2 FROM theTable WHERE iD = @theId; - {code that calculates new values} - BEGIN TRANSACTION; - UPDATE anotherTable SET col1 = @newCol1, col2 = @newCol2 WHERE iD = @theId; - UPDATE theTable SET val1 = @newVal1, val2 = @newVal2 WHERE iD = @theId AND val1 = @oldVal1 AND val2 = @oldVal2; - {if AffectedRows == 1 } - COMMIT TRANSACTION; - {go on with your other code} - {else} - ROLLBACK TRANSACTION; - {decide what to do since it has gone bad... in your code} - {endif}
This last example shows that if you check for collisions at some point and discover a collision has happened when you have already modified other tables/rows.. ..then with transactions you are able to rollback ALL the changes that you've done since the beginning. Obviously it is up to you (that knows what your application is doing) to decide how large the amount of operations to rollback is for each possible collision and based on this decide where to put the transactions boundaries and where to check for collisions with the special UPDATE + AffectedRows check.
In this case with transactions we have separated the moment when we perform the UPDATE from the moment when it is committed. So what happens when an "other process" performs an update in this time frame? To know what happens exactly requires delving into the details of isolation levels (and how they are managed on each engine). As an example in the case of Microsoft SQL Server with READ_COMMITTED the updated rows are locked until the COMMIT, so "other process" can't do nothing (is kept waiting) on that rows, neither a SELECT (in fact it can only READ_COMMITTED). So since the "other process" activity is deferred it's UPDATE will fail.
- SELECT iD, val1, val2, version FROM theTable WHERE iD = @theId; - {code that calculates new values} - UPDATE theTable SET val1 = @newVal1, val2 = @newVal2, version = version + 1 WHERE iD = @theId AND version = @oldversion; - {if AffectedRows == 1 } - {go on with your other code} - {else} - {decide what to do since it has gone bad... in your code} - {endif}
Here it is shown that instead of checking if the value is still the same for all the fields we can use a dedicated field (that is modified each time we do an UPDATE) to see if anyone was quicker than us and changed the row between our SELECT and UPDATE. Here the absence of transactions is due to the simplicity as in the first example and is not related with the version column use. Again this column use is up to the implementation in the application code and not a database engine feature.
More than this there are other points which I think would make this answer too long (is already much too long) so I only mention them by now with some references:
Since the isolation level value and implementation may be different the best advice (as usual in this site) is to perform a test on the used platform / environment.
It may seem difficult but in reality it can be done quite easily from any DB development environment using two separate windows and starting on each one a transaction then executing the commands one by one.
At some point you will see that the the command execution continues indefinitely. Then when on the other window it is called COMMIT or ROLLBACK it completes the execution.
Here are some very basic commands ready to be tested as just described.
Use these for creating the table and one useful row:
CREATE TABLE theTable( iD int NOT NULL, val1 int NOT NULL, val2 int NOT NULL ); INSERT INTO theTable (iD, val1, val2) VALUES (1, 2 ,3);
Then the following on two different windows and step by step:
BEGIN TRAN SELECT val1, val2 FROM theTable WHERE iD = 1; UPDATE theTable SET val1=11 WHERE iD = 1 AND val1 = 2 AND val2 = 3; COMMIT TRAN
Then change the order of commands and order of execution in any order you may think.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With