Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP short unique ID generation using auto_increment?

I would like to generate a short, unique ID without having to check for collisions.

I currently do something like this, but the ID I currently generate is random and checking for collisions in a loop is annoying and will get expensive if the number of records grows significantly.

Normally worrying about collisions isn't an issue, but the unique ID I want to generate is a short unique string 5-8 characters, alpha-numeric, like tinyurl does.

EDIT: I would like to start out with 5 characters and if I hit 60 million entries, then go to 6.. so on and so forth.

To this end, I was thinking I could use an auto_increment value that is hidden from the users, and present them instead with an MD5 or some other method to generate a unique string from that.

Generated strings should not appear to be linear, so simply converting the auto_incremented ID into base 36 [0-9A-Z] is a bit too simplistic, but a function something like that is where I'm going with this.

EDIT: Security is not an issue as this will not be used to secure information. It is simply a shortcut to a longer string. Thank you.

Thank you for your suggestions and sorry for the delay. Dentist..

like image 637
Daren Schwenke Avatar asked Oct 30 '09 14:10

Daren Schwenke


1 Answers

You'll need something that's correct by construction, i.e. a permutation function: this is a function that does a one-to-one, reversible mapping of one integer (your sequential counter) to another. Some examples (any combination of these should also work):

  • inverting some of the bits (f.i. using an XOR, ^ in PHP)
  • swapping the places of bits (($i & 0xc) >> 2 | ($i & 0x3) << 2), or just reversing the order of all bits
  • adding a constant value modulo your maximum range (must be a factor of two, if you're combining this with the ones above)

Example: this function will convert 0, 1, 2, 3, 5, .. into 13, 4, 12, 7, 15, .. for numbers up to 15:

$i=($input+97) & 0xf;
$result=((($i&0x1) << 3) + (($i&0xe) >> 1)) ^ 0x5;

EDIT

An easier way would to use a linear congruential generator (LCG, which is usually used for generating random numbers), which is defined by a formula of the form:

X_n+1 = (a * X_n + c) mod m

For good values of a, c and m, the sequence of X_0, X_1 .. X_m-1 will contain all numbers between 0 and m-1 exactly once. Now you can start from a linearly increasing index, and use the next value in the LCG sequence as your "secret" key.

EDIT2

Implementation: You can design your own LCG parameters, but if you get it wrong it won't cover the full range (and thus have duplicates) so I'll use a published and tried set of parameters here from this paper:

a = 16807, c = 0, m = 2147483647

This gives you a range of 2**31. With pack() you can get the resulting integer as a string, base64_encode() makes it a readable string (of up to 6 significant characters, 6 bits per byte) so this could be your function:

substr(base64_encode(pack("l", (16807 * $index) % 2147483647)), 0, 6)
like image 187
Wim Avatar answered Nov 10 '22 10:11

Wim