Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate 63 million prize codes

Tags:

php

I am trying to come up with a "smart" and "secure" way of generating about 63million unique codes to be used in a competition. The codes will be about 10 characters long.

Has anyone done anything similar or aware of any "hurdles" that might arise from this issues? How do we minimize the probability of someone being able to guess the codes?

This project will be done in PHP, but that won't really matter, it's more of the logic that's an issue here.

Any feedback would be really appreciated.

UPDATE Just to clarify it will be 10 characters of case insensitive Alpha Numeric Values. A-Z0-9

like image 365
Shadi Almosri Avatar asked Jul 02 '09 11:07

Shadi Almosri


4 Answers

Syntax:

You probably will have people copying these codes, so that means these codes should be easy to copy. 10^10 is too small, as Gamecat points out. kgiannakakis has a better idea, but that causes another problem: "1" looks a lot like "I". "0", "C", "O" and "Q" are also quite similar. This is not a big problem. Define a safe alfabet: "0123456789ABDEFGHJKLMNPRSTUVXYZ" (leaves out COIQ) From the comments: depending on the fonts you pick, 5/S and U/V may also be visually ambiguous; replace as required. This is a 32 symbol (5 bit) code. A 10 character code is a 50 bits number. Those should be fairly trivial to generate, sort, copy, compare etc. Chances of being guessed are about 0.63E-7

Since the codes are too long to remember, users will need a resting point when copying them. So split the string in two or three parts and make sure the input field matches this breakdown.

E.g. AKG3L-45TEE => two groups of 5, and even if you can't remember 5 chars it's a lot easier to find the point back where you stopped reading.


How to generate them:

This is fairly simple. You don't need a particularly sophistciated algorithm to generate candidates. You can generate 10 random numbers per code needed, take 5 bits from each number (typically the middle bits are best, e.g. (rand()/64) modulo 32). Use this value [0-31] as index into your alphabet. Create a database table with this string as a primary key, and insert until the table has 63 million entries. You probably want to add "generated on" and "redeemed on" dates to this table.

like image 78
MSalters Avatar answered Nov 15 '22 08:11

MSalters


If I understand you correctly, you want to create 63 milion codes of 10 digits that have a low "guess factor".

There are 10,000,000,000 valid combinations. Of these 63,000,000 are price numbers. 63 / 10,000 = 0.0063. So each guess has a 0,63% chance of success. Does not sound high, but with brute force, the numbers are quite easy to obtain.

Are you sure the 63 on 10,000 ratio is good enough?

like image 43
Toon Krijthe Avatar answered Nov 15 '22 07:11

Toon Krijthe


Generate a set of truly random, unique 64-bit numbers over the range 0 - 250-1. You'll need to keep track of the ones you have seen and reject duplicates. Use each 5 bits of the lower 50 bits of this number pull from a 32-character alphabet -- basically all the letters in the English alphabet (upper or lowercase) minus L and O plus the digits 2-9 (this reduces confusion between l/1 and 0/O). For 63 million codes, this would give you a 0.000006% probability (63,000,000/250) of choosing a valid code sequence randomly.

I've also done this using an autogenerated, primary key (int) and bit-interleaving it with a 32-bit random value. In this case I used the full 64-bits to generate 13 characters from the alphabet and added two random characters at fixed positions for a 15-character code. When redeeming the code you reverse the algorithm to extract the key and the randomness, throwing away the two extra random characters, then compare the randomness to that found stored with the key to validate the code.

like image 5
tvanfosson Avatar answered Nov 15 '22 07:11

tvanfosson


Be careful when using alphanumerics for codes, as you don't want to accidentally generate anything confusing or embarrassing. To avoid confusion I suggest removing 1 and L, 0 and O, and maybe 8 and B. To avoid embarrassment consider removing all vowels so that you cannot accidentally spell anything (use your imagination here).

like image 4
Alan Avatar answered Nov 15 '22 06:11

Alan