Reversible hash function?

Tags: python, hash

I need a reversible hash function (obviously the input will be much smaller in size than the output) that maps the input to the output in a random-looking way. Basically, I want a way to transform a number like "123" to a larger number like "9874362483910978", but not in a way that will preserve comparisons, so it must not be always true that, if x1 > x2, f(x1) > f(x2) (but neither must it be always false).

The use case for this is that I need to find a way to transform small numbers into larger, random-looking ones. They don't actually need to be random (in fact, they need to be deterministic, so the same input always maps to the same output), but they do need to look random, at least when base64-encoded into strings. That rules out simply shifting by Z bits, because similar numbers would still have similar MSBs.

Also, easy (fast) calculation and reversal is a plus, but not required.

I don't know if I'm being clear, or if such an algorithm exists, but I'd appreciate any and all help!


asked Nov 25 '10 03:11

Stavros Korokithakis

2 Answers

None of the answers provided seemed particularly useful, given the question. I had the same problem, needing a simple, reversible hash for non-security purposes, and decided to go with bit relocation. It's simple, it's fast, and it doesn't require knowing anything about boolean maths or crypto algorithms or anything else that requires actual thinking.

The simplest would probably be to just move half the bits left, and the other half right:

    def hash(n):
        return ((0x0000FFFF & n) << 16) + ((0xFFFF0000 & n) >> 16)

This is reversible, in that hash(hash(n)) = n, and has non-sequential pairs {n,m}, n < m, where hash(m) < hash(n).
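Both claims are easy to check directly. The following is a minimal sketch, assuming non-negative inputs that fit in 32 bits; the name `swap_halves` is introduced here only to avoid shadowing Python's built-in `hash`, and the sample values are arbitrary.

```python
def swap_halves(n):
    """Swap the low and high 16-bit halves of a 32-bit value."""
    return ((0x0000FFFF & n) << 16) + ((0xFFFF0000 & n) >> 16)


# Applying the swap twice restores the original value.
for n in (0, 1, 123, 65535, 65536, 0xDEADBEEF, 0xFFFFFFFF):
    assert swap_halves(swap_halves(n)) == n

# Order is not preserved: 1 < 65536, but the outputs compare the other way.
print(swap_halves(1))      # 65536
print(swap_halves(65536))  # 1

# Order is not always reversed either: 1 < 2 and the outputs keep that order.
print(swap_halves(1) < swap_halves(2))  # True
```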

To get a less sequential looking implementation, you might also want to consider an interlace reordering from [msb,z,...,a,lsb] to [msb,lsb,z,a,...] or [lsb,msb,a,z,...] or any other relocation you feel gives an appropriately non-sequential sequence for the numbers you deal with, or even add a XOR on top of that to make it look even less sequential.

(The above function is safe for numbers that fit in 32 bits. Larger numbers are guaranteed to cause collisions, and would need wider bit masks to prevent problems. That said, 32 bits is usually enough for any non-security uid.)
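One way the interlace-plus-XOR idea could look is sketched below. This is an illustration, not code from the answer: it assumes 32-bit non-negative inputs, the names `interlace_order`, `move_bits`, `scramble` and `unscramble` are hypothetical, and `XOR_KEY` is an arbitrary constant picked for the example.

```python
BITS = 32                # assumed width; must be even for this interlace
XOR_KEY = 0xA5A5A5A5     # arbitrary illustrative constant, not from the answer


def interlace_order(bits):
    """Input bit index for each output bit, listed from output MSB to LSB.

    Produces [msb, lsb, msb-1, lsb+1, ...] in terms of input bit positions.
    """
    order = []
    hi, lo = bits - 1, 0
    while hi > lo:
        order += [hi, lo]
        hi -= 1
        lo += 1
    return order


# PERM[dst] is the input bit that lands at output bit position dst (0 = LSB).
PERM = list(reversed(interlace_order(BITS)))

# INV[src] is the output position that input bit src was moved to.
INV = [0] * BITS
for dst, src in enumerate(PERM):
    INV[src] = dst


def move_bits(n, table):
    """Build a value whose bit `dst` is bit `table[dst]` of n."""
    out = 0
    for dst, src in enumerate(table):
        out |= ((n >> src) & 1) << dst
    return out


def scramble(n):
    """Relocate the bits of n, then XOR with the key."""
    if not 0 <= n < (1 << BITS):
        raise ValueError("n must fit in %d bits" % BITS)
    return move_bits(n, PERM) ^ XOR_KEY


def unscramble(h):
    """Undo the XOR first, then move every bit back to where it came from."""
    return move_bits(h ^ XOR_KEY, INV)
```

Because both steps are bijections on 32-bit values, no two inputs in range collide, and `unscramble(scramble(n))` returns `n`. A per-bit loop is slower than the two-mask swap, so this trades some speed for a less regular-looking output.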

Also have a look at the multiplicative inverse answer given by Andy Hayden, below.


answered Sep 28 '22 04:09

Mike 'Pomax' Kamermans

Another simple solution is to use multiplicative inverses (see Eric Lippert's blog):

we showed how you can take any two coprime positive integers x and m and compute a third positive integer y with the property that (x * y) % m == 1, and therefore that (x * z * y) % m == z % m for any positive integer z. That is, there always exists a “multiplicative inverse”, that “undoes” the results of multiplying by x modulo m.

We take a large number e.g. 4000000000 and a large co-prime number e.g. 387420489:

    def rhash(n):
        return n * 387420489 % 4000000000

    >>> rhash(12)
    649045868

We first calculate the multiplicative inverse of 387420489 modulo 4000000000 with modinv, which turns out to be 3513180409:

    >>> 3513180409 * 387420489 % 4000000000
    1
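The answer does not show `modinv` itself. A minimal sketch of one way to write it, using the iterative extended Euclidean algorithm, might look like this; the function body is an assumption about what such a helper does, not the code the answer used. On Python 3.8 and later, the built-in `pow(a, -1, m)` computes the same value.

```python
def modinv(a, m):
    """Return y in [0, m) with (a * y) % m == 1, assuming gcd(a, m) == 1."""
    # Invariant: old_r == old_s * a (mod m) and r == s * a (mod m).
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("a and m are not coprime, so no inverse exists")
    return old_s % m


print(modinv(387420489, 4000000000))  # 3513180409
```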

Now, we can define the inverse:

    def un_rhash(h):
        return h * 3513180409 % 4000000000

    >>> un_rhash(649045868)  # un_rhash(rhash(12))
    12

Note: This is fast to compute and works for any non-negative number below 4000000000. If you need to handle larger numbers, choose a sufficiently large modulus (and another multiplier that is coprime to it).
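To handle a larger range, the same construction can be parameterised. The sketch below is an illustration under stated assumptions: `make_rhash` is a hypothetical helper, the modulus `2**64` and the odd multiplier are arbitrary choices for the example (any odd number is coprime to a power of two), and `pow(x, -1, m)` requires Python 3.8 or later.

```python
from math import gcd


def make_rhash(multiplier, modulus):
    """Return an (rhash, un_rhash) pair for inputs in range(modulus)."""
    if gcd(multiplier, modulus) != 1:
        raise ValueError("multiplier and modulus must be coprime")
    inverse = pow(multiplier, -1, modulus)  # modular inverse, Python 3.8+

    def rhash(n):
        if not 0 <= n < modulus:
            raise ValueError("n must be in range(modulus)")
        return n * multiplier % modulus

    def un_rhash(h):
        return h * inverse % modulus

    return rhash, un_rhash


# Illustrative 64-bit variant; the multiplier is just some odd constant.
rhash64, un_rhash64 = make_rhash(0x9E3779B97F4A7C15, 2**64)
assert un_rhash64(rhash64(12345678901234567890)) == 12345678901234567890
```

Inputs at or above the modulus are rejected because they would collide with smaller inputs after the `%`.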

You may want to do this with hexadecimal (to pack the int):

    def rhash(n):
        return "%08x" % (n * 387420489 % 4000000000)

    >>> rhash(12)
    '26afa76c'

    def un_rhash(h):
        return int(h, 16) * 3513180409 % 4000000000

    >>> un_rhash('26afa76c')  # un_rhash(rhash(12))
    12

If you choose a relatively large co-prime then this will seem random, be non-sequential and also be quick to calculate.

answered Sep 28 '22 03:09

Andy Hayden
