We are using MySQL and developing an application where we'd like the ID sequence not to be publicly visible. The IDs are hardly top secret, and it would not be a significant issue if someone managed to decode them.
A hash is the obvious solution, and we are currently using MD5: 32-bit integers go in, and we trim the MD5 to 64 bits and store that. However, we have no idea how likely collisions are when you trim like this (especially since all numbers come from autoincrement or the current time). We currently check for collisions, but since we may be inserting 100,000 rows at once, the performance is terrible (we can't bulk insert).
But in the end, we really don't need the security offered by hashes; they consume unnecessary space and also require an additional index. So, is there a simple, good-enough function or algorithm that guarantees a one-to-one mapping for any number, without obvious visual patterns for sequential numbers?
EDIT: I'm using PHP, which does not support fixed-width (wrapping) integer arithmetic by default, but after looking around I found that it can be cheaply replicated with bitwise operators. Code for 32-bit integer multiplication can be found here: http://pastebin.com/np28xhQF
You could simply XOR with 0xDEADBEEF, if that's good enough.
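To see what "good enough" means here, the following is a minimal Go sketch of the XOR approach (the names `mask` and `obfuscate` are made up for illustration). XOR with a constant is its own inverse, so one function both encodes and decodes. The catch is that consecutive IDs only differ in their low bits, so the codes for 1, 2, 3, 4 still sit right next to each other.

```go
package main

import "fmt"

// mask is the XOR constant suggested above.
const mask uint32 = 0xDEADBEEF

// obfuscate XORs the ID with the mask. Applying it twice returns the
// original value, so the same function also decodes.
func obfuscate(id uint32) uint32 {
	return id ^ mask
}

func main() {
	for id := uint32(1); id <= 4; id++ {
		code := obfuscate(id)
		fmt.Printf("%d --> %d (0x%08X) --> %d\n", id, code, code, obfuscate(code))
	}
}
```

If neighbouring codes must not look related, the multiplicative mapping is the better fit.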
Alternatively, multiply by an odd number mod 2^32. For the inverse mapping, just multiply by the multiplicative inverse.
Example: n = 2345678901; multiplicative inverse (mod 2^32): 2313902621
For the mapping, just multiply by 2345678901 (mod 2^32):
1 --> 2345678901
2 --> 396390506
For the inverse mapping, multiply by 2313902621.
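As a rough Go sketch of this (function names are illustrative, not from the answer): `uint32` multiplication wraps around, which is exactly multiplication mod 2^32. The helper `modInverse32` is an assumption on my part for how one might find the inverse of any odd multiplier; it uses Newton's iteration, which doubles the number of correct low bits on each step.

```go
package main

import "fmt"

const (
	multiplier uint32 = 2345678901 // odd, from the example above
	inverse    uint32 = 2313902621 // its multiplicative inverse mod 2^32
)

// encode multiplies mod 2^32; uint32 arithmetic wraps automatically.
func encode(id uint32) uint32 { return id * multiplier }

// decode undoes encode by multiplying with the inverse.
func decode(code uint32) uint32 { return code * inverse }

// modInverse32 returns the inverse of an odd number mod 2^32.
// x = a is already correct to 3 bits, because a*a = 1 (mod 8) for odd a.
func modInverse32(a uint32) uint32 {
	x := a
	for i := 0; i < 5; i++ {
		x *= 2 - a*x
	}
	return x
}

func main() {
	for id := uint32(1); id <= 3; id++ {
		code := encode(id)
		fmt.Println(id, "-->", code, "-->", decode(code))
	}
	fmt.Println("computed inverse:", modInverse32(multiplier))
}
```

An even multiplier has no inverse mod 2^32, which is why the number must be odd.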
For our application, we use a bit shuffle to generate the ID. It is very easy to reverse it back to the original ID.
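// MeetingCode maps the meeting ID to a 24-bit code by permuting its bits.
func (m Meeting) MeetingCode() uint {
	// Offset the ID and keep the low 24 bits.
	hashed := (m.ID + 10000000) & 0x00FFFFFF
	// Split the value into individual bits.
	chunks := [24]uint{}
	for i := 0; i < 24; i++ {
		chunks[i] = hashed >> i & 0x1
	}
	// Output bit i takes input bit shuffle[i]; shuffle is a permutation of 0..23.
	shuffle := [24]uint{14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22, 2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17}
	result := uint(0)
	for i := 0; i < 24; i++ {
		result = result | (chunks[shuffle[i]] << i)
	}
	return result
}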
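The reverse direction is not shown above, so here is one way it might look, as a self-contained sketch. `meetingCode` restates the method above as a plain function, and `meetingID` is a hypothetical name for the inverse: it puts each code bit back at its original position and then removes the offset. Because of the 24-bit mask, the round trip only holds for IDs below 2^24.

```go
package main

import "fmt"

const (
	offset = 10000000
	mask24 = 0x00FFFFFF
)

// shuffle is the bit permutation from the answer above.
var shuffle = [24]uint{14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22, 2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17}

// meetingCode moves input bit shuffle[i] to output bit i.
func meetingCode(id uint) uint {
	hashed := (id + offset) & mask24
	result := uint(0)
	for i := uint(0); i < 24; i++ {
		result |= ((hashed >> shuffle[i]) & 1) << i
	}
	return result
}

// meetingID moves code bit i back to bit shuffle[i], then removes the offset.
// The subtraction may wrap, but masking to 24 bits gives the right answer.
func meetingID(code uint) uint {
	hashed := uint(0)
	for i := uint(0); i < 24; i++ {
		hashed |= ((code >> i) & 1) << shuffle[i]
	}
	return (hashed - offset) & mask24
}

func main() {
	for id := uint(1); id <= 3; id++ {
		code := meetingCode(id)
		fmt.Println(id, "-->", code, "-->", meetingID(code))
	}
}
```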
If you want to ensure a 1:1 mapping then use an encryption (i.e. a permutation), not a hash. Encryption has to be 1:1 because it can be decrypted.
If you want 32 bit numbers then use Hasty Pudding Cypher or just write a simple four round Feistel cypher.
Here's one I prepared earlier:
/**
 * IntegerPerm is a reversible keyed permutation of the integers.
 * This class is not cryptographically secure as the F function
 * is too simple and there are not enough rounds.
 *
 * @author Martin Ross
 */
public final class IntegerPerm {

    //////////////////
    // Private Data //
    //////////////////

    /** Non-zero default key, from www.random.org */
    private final static int DEFAULT_KEY = 0x6CFB18E2;

    private final static int LOW_16_MASK = 0xFFFF;
    private final static int HALF_SHIFT = 16;
    private final static int NUM_ROUNDS = 4;

    /** Permutation key */
    private int mKey;

    /** Round key schedule */
    private int[] mRoundKeys = new int[NUM_ROUNDS];

    //////////////////
    // Constructors //
    //////////////////

    public IntegerPerm() { this(DEFAULT_KEY); }

    public IntegerPerm(int key) { setKey(key); }

    ////////////////////
    // Public Methods //
    ////////////////////

    /** Sets a new value for the key and key schedule. */
    public void setKey(int newKey) {
        assert (NUM_ROUNDS == 4) : "NUM_ROUNDS is not 4";
        mKey = newKey;

        mRoundKeys[0] = mKey & LOW_16_MASK;
        mRoundKeys[1] = ~(mKey & LOW_16_MASK);
        mRoundKeys[2] = mKey >>> HALF_SHIFT;
        mRoundKeys[3] = ~(mKey >>> HALF_SHIFT);
    } // end setKey()

    /** Returns the current value of the key. */
    public int getKey() { return mKey; }

    /**
     * Calculates the enciphered (i.e. permuted) value of the given integer
     * under the current key.
     *
     * @param plain the integer to encipher.
     *
     * @return the enciphered (permuted) value.
     */
    public int encipher(int plain) {
        // 1 Split into two halves.
        int rhs = plain & LOW_16_MASK;
        int lhs = plain >>> HALF_SHIFT;

        // 2 Do NUM_ROUNDS simple Feistel rounds.
        for (int i = 0; i < NUM_ROUNDS; ++i) {
            if (i > 0) {
                // Swap lhs <-> rhs
                final int temp = lhs;
                lhs = rhs;
                rhs = temp;
            } // end if
            // Apply Feistel round function F().
            rhs ^= F(lhs, i);
        } // end for

        // 3 Recombine the two halves and return.
        return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
    } // end encipher()

    /**
     * Calculates the deciphered (i.e. inverse permuted) value of the given
     * integer under the current key.
     *
     * @param cypher the integer to decipher.
     *
     * @return the deciphered (inverse permuted) value.
     */
    public int decipher(int cypher) {
        // 1 Split into two halves.
        int rhs = cypher & LOW_16_MASK;
        int lhs = cypher >>> HALF_SHIFT;

        // 2 Do NUM_ROUNDS simple Feistel rounds, with the round keys reversed.
        for (int i = 0; i < NUM_ROUNDS; ++i) {
            if (i > 0) {
                // Swap lhs <-> rhs
                final int temp = lhs;
                lhs = rhs;
                rhs = temp;
            } // end if
            // Apply Feistel round function F().
            rhs ^= F(lhs, NUM_ROUNDS - 1 - i);
        } // end for

        // 3 Recombine the two halves and return.
        return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
    } // end decipher()

    /////////////////////
    // Private Methods //
    /////////////////////

    // The F function for the Feistel rounds.
    private int F(int num, int round) {
        // XOR with round key.
        num ^= mRoundKeys[round];
        // Square, then XOR the high and low parts.
        num *= num;
        return (num >>> HALF_SHIFT) ^ (num & LOW_16_MASK);
    } // end F()

} // end class IntegerPerm
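For comparison, the same four-round structure can be sketched compactly in Go, where `uint32` avoids the signed-shift bookkeeping. This is my own port under the assumption that it mirrors IntegerPerm's key schedule and F function; the function names are made up, and like the class above it is not cryptographically secure. Deciphering is simply the same network run with the round keys in reverse order.

```go
package main

import "fmt"

const numRounds = 4

// roundKeys derives four round keys from the two 16-bit halves of the key.
func roundKeys(key uint32) [numRounds]uint32 {
	lo, hi := key&0xFFFF, key>>16
	return [numRounds]uint32{lo, ^lo, hi, ^hi}
}

// f is the round function: XOR with the round key, square (wrapping),
// then fold the high half onto the low half. The result fits in 16 bits.
func f(num, roundKey uint32) uint32 {
	num ^= roundKey
	num *= num
	return (num >> 16) ^ (num & 0xFFFF)
}

// feistel runs the rounds, swapping the halves between rounds only.
func feistel(x uint32, keys [numRounds]uint32) uint32 {
	lhs, rhs := x>>16, x&0xFFFF
	for i := 0; i < numRounds; i++ {
		if i > 0 {
			lhs, rhs = rhs, lhs
		}
		rhs ^= f(lhs, keys[i])
	}
	return lhs<<16 | rhs
}

func encipher(x, key uint32) uint32 { return feistel(x, roundKeys(key)) }

func decipher(x, key uint32) uint32 {
	k := roundKeys(key)
	k[0], k[1], k[2], k[3] = k[3], k[2], k[1], k[0] // reverse the schedule
	return feistel(x, k)
}

func main() {
	const key = 0x6CFB18E2 // default key from the class above
	for id := uint32(1); id <= 3; id++ {
		c := encipher(id, key)
		fmt.Println(id, "-->", c, "-->", decipher(c, key))
	}
}
```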
Do what Henrik said in his second suggestion, but since these values seem to be used by people (else you wouldn't want to randomize them), take one additional step. Multiply the sequential number by a large prime and reduce mod N, where N is a power of 2. Choose N at least 4 bits smaller than the largest value you can store, because the next step, multiplying the result by 11, can add up to 4 bits. So we have:
Hash = ((count * large_prime) % 536870912) * 11
Here N = 536870912 = 2^29, so the result can need 33 bits; for an unsigned 32-bit column use N = 268435456 (2^28) instead.
The multiplication by 11 protects against most data entry errors: if any single digit is typed wrong, the result will not be a multiple of 11. If any two adjacent digits are transposed, the result will not be a multiple of 11 either (swapping two digits that are an even number of places apart is not caught). So as a preliminary check of any value entered, you check if it's divisible by 11 before even looking in the database.
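Put together, the scheme might look like the following Go sketch. The multiplier 1000000007 is just a well-known prime I picked for illustration, the function names are hypothetical, and the code uses `uint64` because with N = 536870912 the result of the multiplication by 11 does not fit in 32 bits. Decoding rejects anything that is not a multiple of 11, then multiplies by the inverse of the prime mod N.

```go
package main

import "fmt"

const (
	modulus    uint64 = 536870912  // N = 2^29, as in the formula above
	multiplier uint64 = 1000000007 // assumed "large prime"; must be odd to be invertible mod N
	checkBase  uint64 = 11
)

// inverse is the multiplicative inverse of multiplier mod N.
var inverse = modInverse(multiplier) % modulus

// modInverse returns the inverse of an odd number mod 2^64 using
// Newton's iteration; reducing it mod N gives the inverse mod N.
func modInverse(a uint64) uint64 {
	x := a
	for i := 0; i < 6; i++ {
		x *= 2 - a*x
	}
	return x
}

// encode computes ((count * prime) % N) * 11.
func encode(count uint64) uint64 {
	return (count % modulus) * multiplier % modulus * checkBase
}

// decode returns the original count, or ok == false if the value fails
// the divisibility check and cannot be a valid code.
func decode(code uint64) (count uint64, ok bool) {
	if code%checkBase != 0 {
		return 0, false
	}
	q := code / checkBase
	if q >= modulus {
		return 0, false
	}
	return q * inverse % modulus, true
}

func main() {
	code := encode(1)
	fmt.Println("code for 1:", code)
	fmt.Println(decode(code))     // valid code
	fmt.Println(decode(code + 1)) // mistyped last digit, rejected
}
```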