I'm creating keys for key-value data by taking a few (<10) pieces of information that identify the data and producing one hash from them combined. For this, I've been using CryptoPP's SHA256::Update
function, which lets you add pieces at a time:
#include "sha.h"
...
byte outputBuf[CryptoPP::SHA256::DIGESTSIZE];
CryptoPP::SHA256 hash;
hash.Update(pData1, lenData1); // pData* can point to int, double or std::string
hash.Update(pData2, lenData2);
...
hash.Final(outputBuf);
I've noticed that the order of the calls to Update
matters (i.e. if you change the order of the two Update
statements, you'll get a different hash). I would like this to be order independent instead. So:
xor
to combine the parameters would work. One problem is that if two pieces of data are the same, they'll cancel out. Can you foresee problems with this?The comment saying that xor will increase the number of collisions is only true, if you consider {1, 2}
and {2, 1}
to be different inputs. I guess, you don't, as otherwise you wouldn't want an order-independent hash. So h({1, 2}) = h({2, 1})
is no collision as you're providing the same input.
The simplest solution is sorting and than using your favorite hash function. It is as secure as your hash function (confirm on crypto.stackexchange.com if you care).
Xoring hashes is definitely a bad idea as two equal elements cancel out. Adding them is much better, but with two equal elements, the least significant bit will be zero (with four such elements, two bits will be zero, etc.). This may be acceptable.
Note that any such method is pretty insecure as it allows finding collisions much faster (proof upon request). You may or may not need security, but don't try to invent a secure method, as it's practically impossible (every well-known hash function has many man-months of analysis behind it).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With