Can AUTO_INCREMENT be safely used in a BEFORE TRIGGER in MySQL?

Instagram's Postgres method of implementing custom IDs for sharding is great, but I need the implementation in MySQL.

So I converted the method found at the bottom of this blog post: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

MySQL Version:

CREATE TRIGGER shard_insert BEFORE INSERT ON tablename
FOR EACH ROW BEGIN

DECLARE seq_id BIGINT;
DECLARE now_millis BIGINT;
DECLARE our_epoch BIGINT DEFAULT 1314220021721;
DECLARE shard_id INT DEFAULT 1;

SET now_millis = (SELECT UNIX_TIMESTAMP(NOW(3)) * 1000);
SET seq_id = (SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA = "dbname" AND TABLE_NAME = "tablename");
SET NEW.id = (SELECT ((now_millis - our_epoch) << 23) | (shard_id << 10) | (SELECT MOD(seq_id, 1024)));
END

The table looks roughly like this:

CREATE TABLE tablename (
    id BIGINT AUTO_INCREMENT,
    ...
)

Question:

  1. There is a concurrency problem here. When I spawn 100 threads and run inserts, I get duplicate sequence values, meaning two triggers are reading the same AUTO_INCREMENT value. How can I fix this?

I tried creating a new table, e.g. "tablename_seq", with one row holding a counter for my own auto-increment values, and updating that table inside the trigger. The problem is that I can't LOCK the table in a stored program (trigger), so I have the exact same problem: I can't guarantee that a counter value is unique between triggers.
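
The duplicates can be reproduced without any threads by replaying the interleaving by hand. The following is only a simplified Python model, not MySQL behavior: PeekCounter and ReservingCounter are hypothetical names, the first standing in for reading AUTO_INCREMENT from information_schema (a read that reserves nothing) and the second for any step that reads and advances the counter indivisibly.

```python
class PeekCounter:
    """Models a read-only look at the next value; nothing is reserved."""

    def __init__(self):
        self.next_value = 1

    def peek(self):
        return self.next_value

    def commit_insert(self):
        # The counter only moves once a row is actually inserted.
        self.next_value += 1


class ReservingCounter:
    """Models an indivisible read-and-advance; every caller gets its own value."""

    def __init__(self):
        self.next_value = 1

    def reserve(self):
        value = self.next_value
        self.next_value += 1
        return value


# Two triggers fire before either insert has completed.
peek_counter = PeekCounter()
seq_a = peek_counter.peek()
seq_b = peek_counter.peek()
peek_counter.commit_insert()
peek_counter.commit_insert()

reserving_counter = ReservingCounter()
res_a = reserving_counter.reserve()
res_b = reserving_counter.reserve()

print("peek:", seq_a, seq_b)      # both sessions saw the same value
print("reserve:", res_a, res_b)   # each session got a distinct value
```

In this model the fix is not a lock around the read but making the read itself consume the value.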

I'm stumped and really would appreciate any tips!

Possible Solution:

  1. MySQL 5.6 has UUID_SHORT(), which returns unique, incrementing values. In practice, each call appears to increase the value by 1. Using SET seq_id = (SELECT UUID_SHORT()); appears to remove the concurrency problem. The side effect is that (roughly) no more than 1024 inserts can now happen per millisecond in the entire system; beyond that, a duplicate primary key error becomes possible. The good news is that in benchmarks on my machine I get ~3,000 inserts/s with or without the trigger containing UUID_SHORT(), so it doesn't appear to slow inserts down at all.
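
The 1024-per-millisecond ceiling follows from the MOD(seq_id, 1024) term. A minimal Python sketch of the same arithmetic as the trigger is shown here; make_id and the constant names are illustrative, and the timestamp used is arbitrary.

```python
OUR_EPOCH = 1314220021721   # same custom epoch as in the trigger
TIME_SHIFT = 23             # the time part sits above the low 23 bits
SHARD_SHIFT = 10            # the shard id sits above the 10-bit sequence
SEQ_SLOTS = 1024            # MOD(seq_id, 1024) leaves 1024 distinct values


def make_id(now_millis, shard_id, seq_id):
    """Pack time, shard and sequence the way the trigger does."""
    return (
        ((now_millis - OUR_EPOCH) << TIME_SHIFT)
        | (shard_id << SHARD_SHIFT)
        | (seq_id % SEQ_SLOTS)
    )


now = OUR_EPOCH + 5000  # an arbitrary instant, 5 seconds after the epoch

first = make_id(now, 1, 7)
wrapped = make_id(now, 1, 7 + SEQ_SLOTS)      # 1024 calls later, same millisecond
next_ms = make_id(now + 1, 1, 7 + SEQ_SLOTS)  # same sequence, one millisecond later

print(first == wrapped)   # True: the sequence wrapped and the id repeats
print(first == next_ms)   # False: a new millisecond changes the high bits
```

So a strictly increasing seq_id only guarantees distinct IDs while fewer than 1024 of them land on the same shard within the same millisecond.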
asked Sep 05 '14 01:09 by jsidlosky


2 Answers

The following SQL Fiddle generates an output as shown below:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 45
Server version: 5.5.35-1

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select `id` from `tablename`;
+-------------------+
| id                |
+-------------------+
| 11829806563853313 |
| 11829806563853314 |
| 11829806563853315 |
| 11829806563853316 |
| 11829806563853317 |
| 11829806563853318 |
| 11829806563853319 |
| 11829806563853320 |
| 11829806563853321 |
| 11829806563853322 |
| 11829806563853323 |
| 11829806563853324 |
| 11829806563853325 |
| 11829806563853326 |
| 11829806563853327 |
| 11829806563853328 |
| 11829806563853329 |
| 11829806563853330 |
| 11829806563853331 |
| 11829806563853332 |
| 11829806563853333 |
| 11829806563853334 |
| 11829806563853335 |
| 11829806563853336 |
| 11829806563853337 |
| 11829806563853338 |
| 11829806563853339 |
| 11829806563853340 |
| 11829806563853341 |
| 11829806563853342 |
| 11829806563853343 |
| 11829806563853344 |
| 11829806563853345 |
| 11829806563853346 |
| 11829806563853347 |
| 11829806563853348 |
| 11829806563853349 |
| 11829806563853350 |
| 11829806563853351 |
| 11829806563853352 |
+-------------------+
40 rows in set (0.01 sec)

Accept the answer if it really solves your need.

UPDATE

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> DELIMITER //

mysql> DROP FUNCTION IF EXISTS `nextval`//
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> DROP TRIGGER IF EXISTS `shard_insert`//
Query OK, 0 rows affected (0.00 sec)

mysql> DROP TABLE IF EXISTS `tablename_seq`, `tablename`//
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `tablename_seq` (
    ->   `seq` BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY
    -> )//
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `tablename` (
    ->   `id` BIGINT UNSIGNED PRIMARY KEY
    -> )//
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE FUNCTION `nextval`()
    -> RETURNS BIGINT UNSIGNED
    -> DETERMINISTIC
    -> BEGIN
    ->   DECLARE `_last_insert_id` BIGINT UNSIGNED;
    ->   INSERT INTO `tablename_seq` VALUES (NULL);
    ->   SET `_last_insert_id` := LAST_INSERT_ID();
    ->   DELETE FROM `tablename_seq`
    ->   WHERE `seq` = `_last_insert_id`;
    ->   RETURN `_last_insert_id`;
    -> END//
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TRIGGER `shard_insert` BEFORE INSERT ON `tablename`
    -> FOR EACH ROW
    -> BEGIN
    ->   DECLARE `seq_id`, `now_millis` BIGINT UNSIGNED;
    ->   DECLARE `our_epoch` BIGINT UNSIGNED DEFAULT 1314220021721;
    ->   DECLARE `shard_id` INT UNSIGNED DEFAULT 1;
    ->   SET `now_millis` := `our_epoch` + UNIX_TIMESTAMP();
    ->   SET `seq_id` := `nextval`();
    ->   SET NEW.`id` := (SELECT (`now_millis` - `our_epoch`) << 23 |
    ->                            `shard_id` << 10 |
    ->                            MOD(`seq_id`, 1024)
    ->                   );
    -> END//
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO `tablename`
    -> VALUES
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0),
    -> (0), (0), (0), (0), (0)//
Query OK, 40 rows affected (0.00 sec)
Records: 40  Duplicates: 0  Warnings: 0

mysql> DELIMITER ;

mysql> SELECT `id` FROM `tablename`;
+-------------------+
| id                |
+-------------------+
| 12581084357198849 |
| 12581084357198850 |
| 12581084357198851 |
| 12581084357198852 |
| 12581084357198853 |
| 12581084357198854 |
| 12581084357198855 |
| 12581084357198856 |
| 12581084357198857 |
| 12581084357198858 |
| 12581084357198859 |
| 12581084357198860 |
| 12581084357198861 |
| 12581084357198862 |
| 12581084357198863 |
| 12581084357198864 |
| 12581084357198865 |
| 12581084357198866 |
| 12581084357198867 |
| 12581084357198868 |
| 12581084357198869 |
| 12581084357198870 |
| 12581084357198871 |
| 12581084357198872 |
| 12581084357198873 |
| 12581084357198874 |
| 12581084357198875 |
| 12581084357198876 |
| 12581084357198877 |
| 12581084357198878 |
| 12581084357198879 |
| 12581084357198880 |
| 12581084357198881 |
| 12581084357198882 |
| 12581084357198883 |
| 12581084357198884 |
| 12581084357198885 |
| 12581084357198886 |
| 12581084357198887 |
| 12581084357198888 |
+-------------------+
40 rows in set (0.00 sec)

See db-fiddle.
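
One way to check IDs like the ones listed above is to unpack them back into their three fields. The sketch below is a hypothetical Python helper (decode_id is not part of the answer) and assumes the layout used by the trigger: the low 10 bits are the sequence, the next 13 bits the shard, and everything above is the time part.

```python
def decode_id(value):
    """Split a generated id into (time_part, shard_id, sequence)."""
    sequence = value & 0x3FF            # low 10 bits
    shard_id = (value >> 10) & 0x1FFF   # next 13 bits
    time_part = value >> 23             # everything above bit 22
    return time_part, shard_id, sequence


first_row = decode_id(12581084357198849)
last_row = decode_id(12581084357198888)

print(first_row)  # (1499782128, 1, 1)
print(last_row)   # (1499782128, 1, 40)
```

A time part of about 1.5 billion looks like a Unix timestamp in whole seconds: UNIX_TIMESTAMP() with no argument returns seconds, and the trigger adds our_epoch and then subtracts it again. Under that reading, the 1024-slot sequence in this version wraps per second rather than per millisecond.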

answered Oct 08 '22 01:10 by wchiquito


An alternative is to grab blocks of auto-increment numbers. If you set MySQL's auto_increment_increment to something like 1000, a process can insert into the "sequence" table and read back the auto-increment value. The process then knows it has 1000 sequential numbers, starting at that value, that are free of conflicts. There is no need to record every increment in a central table if all you are recording is a number.
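
The block idea can be sketched in Python as follows. This is only an illustration under stated assumptions: CentralSequence is a hypothetical stand-in for the insert into the "sequence" table (a lock plays the role of the database), and BlockAllocator is the per-process side that spends a block locally before asking for another.

```python
import threading

BLOCK_SIZE = 1000  # plays the role of auto_increment_increment


class CentralSequence:
    """Stand-in for the "sequence" table: each call hands out a new block start."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_start = 1

    def reserve_block(self):
        with self._lock:
            start = self._next_start
            self._next_start += BLOCK_SIZE
            return start


class BlockAllocator:
    """One per process: serves numbers locally, refilling a block at a time."""

    def __init__(self, central):
        self._central = central
        self._next = 0
        self._end = 0  # exclusive upper bound of the current block

    def next_id(self):
        if self._next >= self._end:
            self._next = self._central.reserve_block()
            self._end = self._next + BLOCK_SIZE
        value = self._next
        self._next += 1
        return value


central = CentralSequence()
results = []


def worker():
    allocator = BlockAllocator(central)
    results.append([allocator.next_id() for _ in range(2500)])


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_ids = [i for ids in results for i in ids]
print(len(all_ids), len(set(all_ids)))  # 10000 10000
```

Each worker touches the central sequence only three times for 2500 numbers, which is the point of the approach; the trade-off is that unused numbers in a block are simply skipped.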

This is most commonly used in multi-master setups together with auto_increment_offset. You could go the multi-master route too and insert on the different masters; auto_increment_increment and auto_increment_offset would ensure there are no conflicts. This requires solid knowledge of MySQL replication.

answered Oct 08 '22 01:10 by Brent Baisley