Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Injective pairing function

I'm looking for a pairing function f: ZxZ -> Z, with the following characteristics:

  • It doesn't need to be reversible. I only need it to be injecive (different pairs map to different integers), I never need to compute the pair back.
  • It is defined over Z (signed integers)
  • It is efficiently computable

At the moment, I'm using f(x,y) = x + (max(x)-min(x)+1) * y

It works, I'm just wondering whether there could be another function that uses the result space more efficiently, considering that:

  • x,y are signed integers up to 64bits
  • f(x,y) is an integer, at most 64 bits
  • len(f(x,y)) <= 64 bits is easily computable

I do know that this means I cannot map all x,y combinations without the result to overflow. I'm happy enough with being able to establish whether the conversion would fit in 64bits or not. For this reason, the ideal mapping function would use the available 64bits as efficiently as possible.

Any tip?

like image 800
cornuz Avatar asked Feb 05 '26 18:02

cornuz


1 Answers

To encode two 64 bit integers into a unique single number, there are 2^64 * (2^64 -1) combinations of inputs possible, so by the obvious Pigeonhole Principle, we need an output of size at least 2^64 * (2^64 -1), which is equal to 2^128 - 2^64, or in other words, you need a capacity of 128 bits to hold all the possible outputs.


I know it can't exist for all values. But that depends on the values, not on the data types. E.g. f(4,5) can still be done, even when 4 and 5 are stored as 64bit integers. It's easy to check, depending on the function used, for overflows (in that case I wouldn't use the mapping).

You know that. That said, as you say you could have a cap on maximum values for your 64 bit inputs. The output then can be 64 bit signed or unsigned integer.

Output being signed, an implementation in C#:

public static long GetHashCode(long a, long b)
{
    if (a < int.MinValue || a > int.MaxValue || b < int.MinValue || b > int.MaxValue)
        throw new ArgumentOutOfRangeException();

    var A = (ulong)(a >= 0 ? 2 * a : -2 * a - 1);
    var B = (ulong)(b >= 0 ? 2 * b : -2 * b - 1);
    var C = (long)((A >= B ? A * A + A + B : A + B * B) / 2);
    return a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;
}

Output being unsigned, an implementation in C#:

public static ulong GetHashCode(long a, long b)
{
    if (a < int.MinValue || a > int.MaxValue || b < int.MinValue || b > int.MaxValue)
        throw new ArgumentOutOfRangeException();

    var A = (ulong)(a >= 0 ? 2 * a : -2 * a - 1);
    var B = (ulong)(b >= 0 ? 2 * b : -2 * b - 1);
    return A >= B ? A * A + A + B : A + B * B;
}

The unsigned implementation will be slightly faster because of the fewer calculations. The lower and upper bound to uniquely pair is int.MaxValue (-2147483648) and int.MaxValue(2147483647). The original function is taken from here. The Elegant Pairing function mentioned in the link is the most space efficient possible since it maps to every single point in the available space. For more on similar methods, see Mapping two integers to one, in a unique and deterministic way

like image 60
nawfal Avatar answered Feb 09 '26 10:02

nawfal



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!