Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert sequence of numbers to random-looking IDs?

Tags:

random

I'm working on an application where I need to generate unique, non-sequential IDs. One of the constraints I have is that they must consist of 3 digits followed by 2 letters (only about 600k IDs). Given my relatively small pool of IDs I was considering simply generating all possible IDs, shuffling them and putting them into a database. Since, internally, I'll have a simple, sequential, ID to use, it'll be easy to pluck them out one at a time & be sure I don't have any repeats.

This doesn't feel like a very satisfying solution. Does anyone out there have a more interesting method of generating unique IDs from a limited pool than this 'lottery' method?

like image 221
Sean McSomething Avatar asked Apr 15 '09 00:04

Sean McSomething


3 Answers

This can be done a lot of different ways, depending on what you are trying to optimize (speed, memory usage, etc.).

ID pattern = ddd c1c[0]

Option 1 (essentially like hashing, similar to Zak's):
1 Generate a random number between 0 and number of possibilities (676k).
2- Convert number to combination

    ddd = random / (26^2)
    c[0] = random % (26)
    c[1] = (random / 26) % 26

3- Query DB for existence of ID and increment until a free one is found.

Option 2 (Linear feedback shift register, see wikipedia):
1- Seed with a random number in range (0,676k). (See below why you can't seed with '0')
2- Generate subsequent random numbers by applying the following to the current ID number

    num = (num >> 1) ^ (-(num & 1u) & 0x90000u);

3- Skip IDs larger than range (ie 0xA50A0+)
4- Convert number into ID format (as above)
*You will need to save the last number generated that was used for an ID, but you won't need to query the DB to see if it is used. This solution will enumerate all possible IDs except [000 AA] due to the way the LFSR works.

[edit] Since your range is actually larger than you need, you can get back [000 AA] by subtracting 1 before you convert to the ID and have your valid range be (0,0xA50A0]

like image 78
Timm Avatar answered Sep 28 '22 04:09

Timm


Use a finite group. Basically, take a 32 or 64-bit integer, and find a large number that is coprime to the maximum value for your integer; call this number M. Then, for all integers n, n * M will result in a unique number that has lots of digits.

This has the advantage that you don't need to pre-fill the database, or run a separate select query -- you can do this all from within one insert statement, by having your n just be an auto-increment, and have a separate ID column that defaults to the n * M.

like image 28
Don Werve Avatar answered Sep 28 '22 05:09

Don Werve


You could generate a random ID conforming to that standard, do a DB select to see if it exists already, then insert it into a DB to note it has been "used". For the first 25% of the life of that scheme (or about 150k entries), it should be relatively fast to generate new random ID's. After that though, it will take longer and longer, and you might as well pre-fill the table to look for free IDs.

like image 37
Zak Avatar answered Sep 28 '22 04:09

Zak