I'm trying to create a custom hashing function for strings. I want to hash strings by their character frequency, so that "hi" and "ih" yield the same hash. Can I override __hash__? Or is creating a wrapper class that holds the string and overrides __hash__ and __eq__ the only way?
You want a derived type with different equality semantics. The usual approach is to define how equality works first, then build the hash method from the same normalized data, since it's necessary that the hash agree with equality: objects that compare equal must have the same hash. That might be:
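import collections

class FrequencyString(str):
    @property
    def normalized(self):
        # Canonical form: the string's characters in sorted order,
        # computed once and cached on the instance.
        try:
            return self._normalized
        except AttributeError:
            self._normalized = normalized = ''.join(sorted(collections.Counter(self).elements()))
            return normalized

    def __eq__(self, other):
        # Equal when both strings contain the same characters with the same counts.
        # Note: other is expected to be a FrequencyString as well.
        return self.normalized == other.normalized

    def __hash__(self):
        # Hash the same canonical form that __eq__ compares.
        return hash(self.normalized)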
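To show how this behaves as a dict or set key, here is a minimal sketch of a slightly more defensive variant. The class name StrictFrequencyString is hypothetical, and two things in it are my own assumptions rather than part of the answer above: a plain str is treated as never equal (because a plain str hashes by its raw characters, not the sorted form), and __ne__ is overridden because, as far as I know, str supplies its own __ne__ that would otherwise compare the raw text.

```python
class StrictFrequencyString(str):
    """Hypothetical variant: a str whose equality ignores character order."""

    @property
    def normalized(self):
        # Sorting the characters gives one canonical form per multiset of characters.
        return ''.join(sorted(self))

    def __eq__(self, other):
        if isinstance(other, StrictFrequencyString):
            return self.normalized == other.normalized
        if isinstance(other, str):
            # Assumption: plain strings never compare equal, which keeps
            # equality consistent with the hash defined below.
            return False
        return NotImplemented

    def __ne__(self, other):
        # Keep != consistent with the custom __eq__.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Hash the same canonical form that __eq__ compares.
        return hash(self.normalized)


a = StrictFrequencyString("hi")
b = StrictFrequencyString("ih")

print(a == b)              # True
print(hash(a) == hash(b))  # True

# Both spellings land on the same dictionary entry.
seen = {a: "first"}
seen[b] = "second"
print(len(seen))           # 1
```

The string itself is left untouched, so str(a) is still "hi"; only comparison and hashing use the sorted form.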