Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

One-to-one integer mapping function

We are using MySQL and developing an application where we'd like the ID sequence not to be publicly visible... the IDs are hardly top secret and there is no significant issue if someone indeed was able to decode them.

So, a hash is of course the obvious solution, we are currently using MD5... 32bit integers go in, and we trim the MD5 to 64bits and then store that. However, we have no idea how likely collisions are when you trim like this (especially since all numbers come from autoincrement or the current time). We currently check for collisions, but since we may be inserting 100.000 rows at once the performance is terrible (can't bulk insert).

But in the end, we really don't need the security offered by the hashes and they consume unnecessary space and also require an additional index... so, is there any simple and good enough function/algorithm out there that guarantees one-to-one mapping for any number without obvious visual patterns for sequential numbers?

EDIT: I'm using PHP which does not support integer arithmetic by default, but after looking around I found that it could be cheaply replicated with bitwise operators. Code for 32bit integer multiplication can be found here: http://pastebin.com/np28xhQF

like image 546
Andreas Avatar asked Sep 02 '11 12:09

Andreas


People also ask

Can a function be onto and not one to one?

Here, y is a natural number for every 'y', there is a value of x which is a natural number. Hence, f is onto. So, the function f:N→N , given by f(1)=f(2)=1 is not one-one but onto.

Is 0 a positive integer?

Zero is a positive integer since it doesn't carry any negative sign. Zero is a positive integer since it doesn't carry any negative sign.


4 Answers

You could simply XOR with 0xDEADBEEF, if that's good enough.

Alternatively multiply by an odd number mod 2^32. For the inverse mapping just multiply by the multiplicative inverse

Example: n = 2345678901; multiplicative inverse (mod 2^32): 2313902621 For the mapping just multiply by 2345678901 (mod 2^32):

1 --> 2345678901 2 --> 396390506

For the inverse mapping, multiply by 2313902621.

like image 189
Henrik Avatar answered Oct 17 '22 19:10

Henrik


For our application, we use bit shuffle to generate the ID. It is very easy to reverse back to the original ID.

func (m Meeting) MeetingCode() uint {
    hashed := (m.ID + 10000000) & 0x00FFFFFF
    chunks := [24]uint{}
    for i := 0; i < 24; i++ {
        chunks[i] = hashed >> i & 0x1
    }
    shuffle := [24]uint{14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22, 2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17}
    result := uint(0)
    for i := 0; i < 24; i++ {
        result = result | (chunks[shuffle[i]] << i)
    }
    return result
}
like image 30
Ciel Avatar answered Sep 22 '22 08:09

Ciel


If you want to ensure a 1:1 mapping then use an encryption (i.e. a permutation), not a hash. Encryption has to be 1:1 because it can be decrypted.

If you want 32 bit numbers then use Hasty Pudding Cypher or just write a simple four round Feistel cypher.

Here's one I prepared earlier:

import java.util.Random;

/**
 * IntegerPerm is a reversible keyed permutation of the integers.
 * This class is not cryptographically secure as the F function
 * is too simple and there are not enough rounds.
 *
 * @author Martin Ross
 */
public final class IntegerPerm {
    //////////////////
    // Private Data //
    //////////////////

    /** Non-zero default key, from www.random.org */
    private final static int DEFAULT_KEY = 0x6CFB18E2;

    private final static int LOW_16_MASK = 0xFFFF;
    private final static int HALF_SHIFT = 16;
    private final static int NUM_ROUNDS = 4;

    /** Permutation key */
    private int mKey;

    /** Round key schedule */
    private int[] mRoundKeys = new int[NUM_ROUNDS];

    //////////////////
    // Constructors //
    //////////////////

    public IntegerPerm() { this(DEFAULT_KEY); }

    public IntegerPerm(int key) { setKey(key); }

    ////////////////////
    // Public Methods //
    ////////////////////

    /** Sets a new value for the key and key schedule. */
    public void setKey(int newKey) {
        assert (NUM_ROUNDS == 4) : "NUM_ROUNDS is not 4";
        mKey = newKey;

        mRoundKeys[0] = mKey & LOW_16_MASK;
        mRoundKeys[1] = ~(mKey & LOW_16_MASK);
        mRoundKeys[2] = mKey >>> HALF_SHIFT;
        mRoundKeys[3] = ~(mKey >>> HALF_SHIFT);
    } // end setKey()

    /** Returns the current value of the key. */
    public int getKey() { return mKey; }

    /**
     * Calculates the enciphered (i.e. permuted) value of the given integer
     * under the current key.
     *
     * @param plain the integer to encipher.
     *
     * @return the enciphered (permuted) value.
     */
    public int encipher(int plain) {
        // 1 Split into two halves.
        int rhs = plain & LOW_16_MASK;
        int lhs = plain >>> HALF_SHIFT;

        // 2 Do NUM_ROUNDS simple Feistel rounds.
        for (int i = 0; i < NUM_ROUNDS; ++i) {
            if (i > 0) {
                // Swap lhs <-> rhs
                final int temp = lhs;
                lhs = rhs;
                rhs = temp;
            } // end if
            // Apply Feistel round function F().
            rhs ^= F(lhs, i);
        } // end for

        // 3 Recombine the two halves and return.
        return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
    } // end encipher()

    /**
     * Calculates the deciphered (i.e. inverse permuted) value of the given
     * integer under the current key.
     *
     * @param cypher the integer to decipher.
     *
     * @return the deciphered (inverse permuted) value.
     */
    public int decipher(int cypher) {
        // 1 Split into two halves.
        int rhs = cypher & LOW_16_MASK;
        int lhs = cypher >>> HALF_SHIFT;

        // 2 Do NUM_ROUNDS simple Feistel rounds.
        for (int i = 0; i < NUM_ROUNDS; ++i) {
            if (i > 0) {
                // Swap lhs <-> rhs
                final int temp = lhs;
                lhs = rhs;
                rhs = temp;
            } // end if
            // Apply Feistel round function F().
            rhs ^= F(lhs, NUM_ROUNDS - 1 - i);
        } // end for

        // 4 Recombine the two halves and return.
        return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
    } // end decipher()

    /////////////////////
    // Private Methods //
    /////////////////////

    // The F function for the Feistel rounds.
    private int F(int num, int round) {
        // XOR with round key.
        num ^= mRoundKeys[round];
        // Square, then XOR the high and low parts.
        num *= num;
        return (num >>> HALF_SHIFT) ^ (num & LOW_16_MASK);
    } // end F()

} // end class IntegerPerm
like image 5
rossum Avatar answered Oct 17 '22 19:10

rossum


Do what Henrik said in his second suggestion. But since these values seem to be used by people (else you wouldn't want to randomize them). Take one additional step. Multiply the sequential number by a large prime and reduce mod N where N is a power of 2. But choose N to be 2 bits smaller than you can store. Next, multiply the result by 11 and use that. So we have:

Hash = ((count * large_prime) % 536870912) * 11

The multiplication by 11 protects against most data entry errors - if any digit is typed wrong, the result will not be a multiple of 11. If any 2 digits are transposed, the result will not be a multiple of 11. So as a preliminary check of any value entered, you check if it's divisible by 11 before even looking in the database.

like image 2
phkahler Avatar answered Oct 17 '22 19:10

phkahler