
Which locking scheme and isolation level should one use for sequence number generation?

Tags: mysql

I would like to know the general practice used in the industry for generating sequence numbers.

That is: get the max value from a table, increment it, and store it back.

For this to work, which isolation level and/or locking scheme should be used?

I thought SERIALIZABLE would work fine, but it only seems to prevent updates to the table. Other sessions can still SELECT, so two sessions could read the same value and write back the same new number. How can we avoid this?

Thanks!

user355562 asked Feb 27 '23 14:02



2 Answers

Reading a value inside a transaction does not by itself protect you from race conditions.

If you run a plain query to get the last used value, increment it, and store it in a new row, two concurrent clients can fetch the same value and both try to use it, resulting in a duplicate key.

There are a few solutions to this:

  1. Locking. With SELECT ... FOR UPDATE, each client takes an exclusive lock on the rows it reads, so a second client blocks until the first one commits or rolls back (as @Daniel Vassallo describes).

  2. Use auto-increment. Allocation of new values happens without regard to transaction scope, so no two concurrent clients will get the same value. The trade-off is that a rollback doesn't undo allocation of a value, so gaps can appear. The LAST_INSERT_ID() function returns the last auto-increment value allocated by the current session, even if other concurrent clients are also generating values in the same table or different tables.

  3. Use an external solution. Generate primary key values not using SQL but with some other system in your application. You're responsible for protecting against race conditions. For instance you could use a counting semaphore.

  4. Use a pseudorandom, unique id. Primary keys need to be unique, but they don't need to be monotonically increasing integers. Some people use the UUID() function to generate a 128-bit value that's virtually guaranteed to not have duplicates (MySQL's UUID() returns a version 1, timestamp-based UUID rather than a purely random number). But then your primary keys have to use a larger data type such as CHAR(36) or BINARY(16), and it's inconvenient to write ad hoc queries.

    SELECT * FROM MyTable WHERE id = '6ccd780c-baba-1026-9564-0040f4311e29';
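For the BINARY(16) variant of option 4, a minimal sketch follows. It assumes MySQL 8.0 or later, where UUID_TO_BIN() and BIN_TO_UUID() are available; the table name uuid_demo and its columns are hypothetical and only serve the illustration.

```sql
-- Hypothetical demo table; assumes MySQL 8.0+ for UUID_TO_BIN() / BIN_TO_UUID().
CREATE TABLE uuid_demo (
  id   BINARY(16) PRIMARY KEY,
  note VARCHAR(50)
) ENGINE=InnoDB;

-- Generate the UUID once so the same value can be stored and looked up again.
SET @u = UUID();

-- Store the 36-character text form as 16 raw bytes.
INSERT INTO uuid_demo (id, note) VALUES (UUID_TO_BIN(@u), 'first row');

-- Ad hoc lookups have to convert in both directions.
SELECT BIN_TO_UUID(id) AS id, note
FROM uuid_demo
WHERE id = UUID_TO_BIN(@u);
```

The conversions in the last query are the inconvenience mentioned above: the key is compact on disk, but every hand-written query has to wrap it.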

You mention in a comment that you "read some negative things" about using auto-increment. Of course any feature in any language has do's and don'ts. It doesn't mean we shouldn't use those features -- it means we should learn how to use them properly.

Can you describe your concerns or any of the negative things about auto-increment? Perhaps folks on this thread can address them.
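To make the auto-increment option concrete, here is a small sketch of the behaviour described in option 2: the value is allocated outside the transaction, so a rollback leaves a gap. The table orders_demo and its columns are hypothetical names chosen for this example.

```sql
-- Hypothetical demo table for the auto-increment option.
CREATE TABLE orders_demo (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  label VARCHAR(50) NOT NULL
) ENGINE=InnoDB;

INSERT INTO orders_demo (label) VALUES ('first');

-- Returns the value allocated to this session's insert,
-- regardless of what other sessions insert in the meantime.
SELECT LAST_INSERT_ID();

-- An allocation made inside a transaction is not given back on rollback.
START TRANSACTION;
INSERT INTO orders_demo (label) VALUES ('rolled back');
ROLLBACK;

INSERT INTO orders_demo (label) VALUES ('second');

-- The surviving rows are not expected to have consecutive ids.
SELECT id, label FROM orders_demo;
```

If the application needs gap-free numbers (for example, invoice numbers), auto-increment alone is not enough and one of the locking approaches is the safer choice.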

Bill Karwin answered Apr 26 '23 10:04



Note that under the REPEATABLE READ isolation level, which is the default for InnoDB, you can simply use the SELECT ... FOR UPDATE syntax, as follows:

Test schema:

CREATE TABLE your_table (id int) ENGINE=INNODB;
INSERT INTO your_table VALUES (1), (2), (3);

Then we can do the following:

START TRANSACTION;

SELECT @x := MAX(id) FROM your_table FOR UPDATE;

+---------------+
| @x := MAX(id) |
+---------------+
|             3 |
+---------------+
1 row in set (0.00 sec)

Without committing the transaction, we start another separate session, and do the same:

START TRANSACTION;

SELECT MAX(id) FROM your_table FOR UPDATE;

The database will wait until the lock set in the previous session is released before running this query.

Switching back to the first session, we can insert the new row and commit the transaction:

INSERT INTO your_table VALUES (@x + 1);

COMMIT;

After the first session commits the transaction, the lock is released, and the query in the second session returns:

+---------+
| MAX(id) |
+---------+
|       4 |
+---------+
1 row in set (8.19 sec)
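The same SELECT ... FOR UPDATE pattern can also be pointed at a dedicated counter row instead of MAX(id) on the data table. The following is only a sketch under that assumption; the table seq_counter, its columns, and the sequence name 'order_no' are hypothetical.

```sql
-- Hypothetical single-purpose counter table: one row per named sequence.
CREATE TABLE seq_counter (
  name     VARCHAR(32) NOT NULL PRIMARY KEY,
  next_val BIGINT NOT NULL
) ENGINE=InnoDB;

INSERT INTO seq_counter (name, next_val) VALUES ('order_no', 1);

START TRANSACTION;

-- Lock the counter row; a concurrent session running the same statement waits here.
SELECT next_val INTO @n
FROM seq_counter
WHERE name = 'order_no'
FOR UPDATE;

-- Advance the counter while still holding the lock.
UPDATE seq_counter SET next_val = @n + 1 WHERE name = 'order_no';

COMMIT;

-- @n now holds the number reserved for this session.
SELECT @n;
```

Because the lock is taken on a single row found through the primary key, sessions allocating numbers only wait on each other, not on unrelated work against the data table.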
Daniel Vassallo answered Apr 26 '23 11:04
