I'm working on a project, in which I should generate at least 70000 codes which contain 8 alphanumeric characters. The codes must be unique. currently I am using php to generate these codes with the following function :
function random_unique_serial($length, PDO $conn) {
$codeCheck=FALSE;
while (!$codeCheck) {
$characters = '0123456789abcdefghijklmnopqrstuvwxyz';
$charactersLength = strlen($characters);
$randomCode = '';
for ($i = 0; $i < $length; $i++) {
$randomCode .= $characters[rand(0, $charactersLength - 1)];
}
$sql = "SELECT * FROM codes WHERE code=:code";
$st = $conn->prepare($sql);
$st->bindvalue(":code", $randomCode, PDO::PARAM_STR);
$st->execute();
$count = $st->rowcount();
if ($count==0) {
$codeCheck=TRUE;
} else {
$codeCheck=FALSE;
}
}
return $randomCode;
}
As you see this codes checks the database for every single code generated to make sure it is not a duplicate. This should work theoretically. However this is very slow and causes the request to time out. I tried increasing execution time but that also didn't help.
Then I decided to use a database side approach and used this solution : Generating a random & unique 8 character string using MySQL
This is also very slow and some of the generated codes are less than 8 characters long.
could you please suggest a better solution?
Use UUID() to Generate Random and Unique Strings in MySQL Another method of producing the 8-character string in MySQL is LEFT(UUID(),8) , as given above. The RFC 4122 (Universally Unique Identifier URN Namespace) specified the UUID() (Universal Unique Identifier) which is a 128-bit long value.
select FLOOR( RAND() * (maximumValue-minimumValue) + minimumValue) as anyVariableName; Let us check with some maximum and minimum value. The maximum value we are considering is 200 and minimum is 100. The random number will be between 100 and 200 including 100 and 200 itself.
There's no built-in way to generate a random string in MySQL, so you need to create a workaround using the provided functions. One of the easiest ways to generate a random string is to use a combination of the SUBSTR() function, the MD5() function, and the RAND() function.
If you need a string of random digits up to 32 characters for test data or just need some junk text to fill a field, SQL Server's NEWID() function makes this simple. NEWID() is used to create a new GUID (globally unique identifier), and we can use that as a base to get a string of random characters.
Create your table structure:
CREATE TABLE t (code CHAR(8) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL UNIQUE);
Define a PHP function to generate a random string:
function random_string(integer $length = 8): string {
return bin2hex(mcrypt_create_iv(ceil($length/2), MCRYPT_DEV_URANDOM));
}
Use PHP to build a multi-value INSERT statement, ram that into the database, count how many were inserted, and repeat until the required number are inserted:
function insert_records(\PDO $pdo, integer $need = 70000): void {
$have = 0;
while ($have < $need) {
// generate multi value INSERT
$sql = 'INSERT IGNORE INTO t VALUES ';
for ($i = 1; $i < $need; $i++) {
$sql .= sprintf('("%s"),', random_string());
}
$sql .= sprintf('("%s");', random_string());
// pass to database and ask how many records were inserted
$result = $pdo->query($sql);
$count = $result->rowCount();
// adjust bookkeeping values so we know how many we have and how many
// we need
$need -= $count;
$have += $count;
}
}
On my machine (Amazon Linux c2.small), the run time for 70k records is about 2 seconds:
real 0m2.136s
user 0m1.256s
sys 0m0.212s
The relevant tricks in this code, to make it fast, are:
INSERT INTO ... VALUES (), (), ... ();
- really helps this as it minimizes the total amount of statement processing MySQL has to do and it tells us how many records were inserted without having to do another query.INSERT IGNORE
to avoid having to check for the existence of every single code we insert, which is really really expensive.mcrypt_create_iv
is a fast generator that is cryptographically secure, so it provides an ideal balance of security and performance.CHAR
to remove unnecessary byte overhead and UNIQUE
to enforce de-duplication.I'd do that with mysql alone, a stored procedure will help - you can still create and call that with php. The stored procedure uses the substring of a md5 hash, created from rand(). The column where the string is to be inserted needs to be unique
. Replace table name and column in in this part:
insert ignore into foo (`uniqueString`)
delimiter //
create procedure createRandomString (in num int)
begin
declare i int default 0;
while i < num do
insert ignore into foo (`uniqueString`) values (substr(md5(rand()), 1, 8));
set i = i + 1;
end while;
end //
delimiter ;
call createRandomString (70000);
I did a quick test, I got 69934 random unique strings inserted on a remote db (from the 70000 runs) within 10s 603ms. Running the same procedure with 80000 as parameter
call createRandomString(80000);
runs 12s 434ms for me, inserting 77354 rows - so you have at least 70000 in little time.
Will produce results like this:
If you want to make sure to have exactly the number of rows inserted as called, use this (but note to set the max_sp_recursion_depth to what it was before after calling the procedure, default is 0):
delimiter //
create procedure createRandomString2 (in num int)
begin
declare i int default 0;
while i < num do
insert ignore into foo (uniqueString) values (substr(md5(rand()), 1, 8));
set i = i + 1;
end while;
if (select count(id) from foo) < num then
call createRandomString2(num - (select count(id) from foo));
END IF;
end //
delimiter ;
set max_sp_recursion_depth = 100;
call createRandomString7 (70000);
set max_sp_recursion_depth = 0;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With