How can I calculate a hash/checksum/fingerprint of an object in c++?
Requirements:
The function must be 'injective'(*). In other words, there should be no two different input objects, that return the same hash/checksum/fingerprint.
Background:
I am trying to come up with a simple pattern for checking whether or not an entity object has been changed since it was constructed. (In order to know which objects need to be updated in the database).
Note that I specifically do not want to mark the object as changed in my setters or anywhere else.
I am considering the following pattern: In short, every entity object that should be persisted, has a member function "bool is_changed()". Changed, in this context, means changed since the objects' constructor was called.
Note: My motivation for all this is to avoid the boilerplate code that comes with marking objects as clean/dirty or doing a member by member comparison. In other words, reduce risk of human error.
(Warning: psudo c++ code ahead. I have not tried compiling it).
class Foo {
private:
std::string my_string;
// Assume the "fingerprint" is of type long.
long original_fingerprint;
long current_fingerprint()
{
// *** Suggestions on which algorithm to use here? ***
}
public:
Foo(const std::string& my_string) :
my_string(my_string)
{
original_fingerprint = current_fingerprint();
}
bool is_changed() const
{
// If new calculation of fingerprint is different from the one
// calculated in the constructor, then the object has
// been changed in some way.
return current_fingerprint() != original_fingerprint;
}
void set_my_string(const std::string& new_string)
{
my_string = new_string;
}
}
void client_code()
{
auto foo = Foo("Initial string");
// should now return **false** because
// the object has not yet been changed:
foo.is_changed();
foo.set_my_string("Changed string");
// should now return **true** because
// the object has been changed:
foo.is_changed();
}
(*) In practice, not necessarily in theory (like uuids are not unique in theory).
You can use the CRC32 algorithm from Boost. Feed it with the memory locations of the data you want to checksum. You could use a hash for this, but hashes are cryptographic functions intended to guard against intentional data corruption and are slower. A CRC performs better.
For this example, I've added another data member to Foo
:
int my_integer;
And this is how you would checksum both my_string
and my_integer
:
#include <boost/crc.hpp>
// ...
long current_fingerprint()
{
boost::crc_32_type crc32;
crc32.process_bytes(my_string.data(), my_string.length());
crc32.process_bytes(&my_integer, sizeof(my_integer));
return crc32.checksum();
}
However, now we're left with the issue of two objects having the same fingerprint if my_string
and my_integer
are equal. To fix this, we should include the address of the object in the CRC, since C++ guarantees that different objects will have different addresses.
One would think we can use:
process_bytes(&this, sizeof(this));
to do it, but we can't since this
is an rvalue and thus we can't take its address. So we need to store the address in a variable instead:
long current_fingerprint()
{
boost::crc_32_type crc32;
void* this_ptr = this;
crc32.process_bytes(&this_ptr, sizeof(this_ptr));
crc32.process_bytes(my_string.data(), my_string.length());
crc32.process_bytes(&my_integer, sizeof(my_integer));
return crc32.checksum();
}
Such a function does not exist, at least not in the context that you are requesting.
The STL provides hash functions for basic types (std::hash), and you could use these to implement a hash function for your objects using any reasonable hashing algorithm.
However, you seem to be looking for an injective function, which causes a problem. Essentially, to have an injective function, it would be necessary to have an output of size greater or equal to that of the object you are considering, since otherwise (from the pigeon hole principle) there would be two inputs that give the same output. Given that, the most sensible option would be to just do a straight-up comparison of the object to some sort of reference object.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With