I am working on a Fountain Code based file transfer system. In this system blocks of data are downloaded, combined with an xor function. I want to verify the blocks as they arrive.
What I need is a cryptographically secure hash function which has the property:
Hash(A) ^ Hash(B) == Hash(A ^ B)
does such a thing exist?
Note: The data blocks must be combined with an xor function, the hashes can be combined with any function you like, so long as it's reasonably cheap to compute.
If the identity you are requesting is exactly
Hash(A) ^ Hash(B) == Hash(A ^ B)
then no, no such cryptographically secure hash function is possible. This is because your function would then be a linear map (over the field with two elements) from the space of possible blocks to the space of possible hashes.
Well, suppose your map takes blocks of length 6 and returns hashes of length 3, and that these are some of the hashes:
Hash(000001) = 010
Hash(000010) = 111
Hash(000100) = 001
Hash(001000) = 101
Hash(010000) = 110
Hash(100000) = 001
Then you can compute the hash of any given block by linear combinations of the above. For example,
Hash(101000) = Hash(100000) ^ Hash(001000) = 001 ^ 101 = 100.
This means that your hash function can be represented by a 6-by-3 matrix.
Wikipedia defines the ideal cryptographic hash function as having four main or significant properties:
The first property may be true, of course, but the rest will not be. Inverting the hash function is as simple as solving a system of linear equations, which is easy. I assume you've done this for linear maps over the real numbers, but the exact same approach works here.
If you find an element of the kernel of the hash function, that is, a message K
such that Hash(K)
is all zeros, then the last property fails also. Take any message M
; then M
and M^K
will have the same hash, because Hash(M^K) = Hash(M)^Hash(K) = Hash(M)^0 = Hash(M)
. Finding elements of the kernel is easy, also.
The third property is a little more difficult, but can be broken also. (For example, suppose you're hashing a legal contract. Find a few places where some stray commas can be modified or something. Consider the effect that these changes have on the hash function, and then solve a system of linear equations.)
What you want is called a Homomorphic Hash. I'm not up with the latest developments, but the one I saw described is extremely - almost unfeasibly - slow to compute. The original paper is here, and a followup with some refinements to its use is here.
As for combining blocks, the hash typically requires you use addition in a prime field. If you're using fountain codes, you don't have to use xor, though - any reversible function is fine, and that includes addition. The hash described above works on addition and multiplication in a prime field, and is provably secure.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With