The UUID()
function by itself produces a different value each time it is called, as I would expect it to do:
SELECT UUID() from INFORMATION_SCHEMA.TABLES LIMIT 3;
3bb7d468-f9c5-11e9-8349-d05099466715
3bb7d482-f9c5-11e9-8349-d05099466715
3bb7d492-f9c5-11e9-8349-d05099466715
However, as soon as we use it within the REPLACE()
function, it begins producing the same value:
SELECT REPLACE(UUID(),'-','-') from INFORMATION_SCHEMA.TABLES LIMIT 3;
e0f2d47a-f9c5-11e9-8349-d05099466715
e0f2d47a-f9c5-11e9-8349-d05099466715
e0f2d47a-f9c5-11e9-8349-d05099466715
This 'breaks' Insert From Select statements like this where we expect each inserted row to have a unique value:
INSERT INTO MyTable (uid, tableName) -- uid is binary(16)
SELECT UNHEX(REPLACE(UUID(),'-','')), TABLE_NAME from INFORMATION_SCHEMA.TABLES;
Note, I am using the information schema's list of tables for convenience. It shouldn't matter, but for those that are curious, our PK's are UUIDs in binary(16) form. I can't change that; please don't focus on that.
The UUID()
function is non-deterministic, while the REPLACE()
function is deterministic. I would have expected the non-deterministic characteristic of the UUID()
function to result in the REPLACE()
function behaving as if it had a different argument for each row, but it seems as though the DB engine is over optimizing by assuming the UUID()
to be constant.
I also tested this behavior with another non-deterministic function, RAND()
, and in this case the REPLACE()
function worked as we'd expect!
SELECT REPLACE(RAND(),' ',' ') from INFORMATION_SCHEMA.TABLES LIMIT 3;
0.911571646026868
0.626416072832808
0.6977608461843439
Questions:
Is there a way to perform an "Insert From Select" and generate a unique UUID in binary 16 form per row in the select?
Why is this happening? Is this a bug?
Updates
I am using 5.7.27 locally:
mysql Ver 14.14 Distrib 5.7.27, for Linux (x86_64)
But this will end up deploying to an AWS RDS instance. lol... The Terraform (scripted deploy) spins up an AWS RDS instance with engine version 5.7.16.
Looking in the AWS console, I see support up to version 5.7.26 (in the 5.7 vein) and 8.0.16 (in the 8.0 vein). I'll discuss upgrading the deployed engine version. I'd love to change the PK column definitions to default the values as @Schwern has suggested.
Work Around
Until I can get others to agree to a version change, I'm moving forward by using a temporary table as intermediate storage for generated id values.
CREATE TEMPORARY TABLE GeneratedIds (
generatedId varchar(36) NOT NULL,
tableName text NOT NULL
);
INSERT INTO GeneratedIds (generatedId, tableName)
SELECT UUID(), TABLE_NAME from INFORMATION_SCHEMA.TABLES;
INSERT INTO MyTable (uid, tableName) -- uid is binary(16)
SELECT UNHEX(REPLACE(generatedId,'-','')), tableName FROM GeneratedIds;
DROP TABLE GeneratedIds;
This is not very elegant, but it does work. In my case I am working within a sql migration file where I can string together this kind of sequence of sql in a cohesive manner. I wouldn't recommend doing this in code; it smells.
Conclusion
This does appear to be a bug in MySQL. I did a quick search of their bug DB but I did not find a mention of it. Regardless, the SQL statements above illustrate the defect, and @Schwern and I have shown that this bug has been fixed in version 5.7.27 (exactly) and version 8.0.16 (possibly all 8.., only tested 8.0.16 and 8.0.18).
Version 8.0.16 test:
Server version: 8.0.16 MySQL Community Server - GPL
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> SELECT REPLACE(UUID(),'-','-') from INFORMATION_SCHEMA.TABLES LIMIT 3;
+--------------------------------------+
| REPLACE(UUID(),'-','-') |
+--------------------------------------+
| 96f9205a-fdc6-11e9-87de-d05099466715 |
| 96f920f9-fdc6-11e9-87de-d05099466715 |
| 96f9213e-fdc6-11e9-87de-d05099466715 |
+--------------------------------------+
3 rows in set (0.00 sec)
This function in MySQL is used to return a Universal Unique Identifier (UUID) generated according to RFC 4122, “A Universally Unique Identifier (UUID) URN Namespace”. It is designed as a number that is universally unique. Two UUID values are expected to be distinct, even they are generated on two independent servers.
Pros. Using UUID for a primary key brings the following advantages: UUID values are unique across tables, databases, and even servers that allow you to merge rows from different databases or distribute databases across servers. UUID values do not expose the information about your data so they are safer to use in a URL.
A UUID is a Universal Unique Identifier specified by RFC 4122 (It is a Universally Unique Identifier URN Namespace) and 128-bit long value. It is designed in such a way that it generates a number which is unique globally according to space and time.
IS_UUID() function in MySQL. This function in MySQL is used to check whether the given Universal Unique Identifier (UUID) is valid or not. It returns 1 if the argument is a valid string-format UUID, 0 if the argument is not a valid UUID, and NULL if the argument is NULL. Parameter : This method accepts one parameter.
MySQL 5.6.46, 5.7.28, nor 8.0.18 do not appear to have this issue. Upgrade if you can.
One of the benefits of upgrading is you can now use functions as column defaults. This allows you to set the UUID as the default for your primary key sidestepping this and many other problems. You can also use uuid_to_bin
and bin_to_uuid
.
create table MyTable (
uid binary(16) primary key default(uuid_to_bin(uuid())),
tableName text not null
);
INSERT INTO MyTable (tableName)
SELECT TABLE_NAME from INFORMATION_SCHEMA.TABLES;
If you can't upgrade, you can use a trigger to set the default primary key as in this answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With