In this presentation the speaker has created a value class.
In implementing it, he overrides #eql?
and says that in Java development, the idiom is that whenever you override #eql?
you must override #hash
.
class Weight
# ...
def hash
pounds.hash
end
def eql?(other)
self.class == other.class &&
self.pounds == other.pounds
end
alias :== eql?
end
Firstly, what is the #hash
method? I can see it returns an integer.
> 1.hash
=> -3708808305943022538
> 2.hash
=> 1196896681607723080
> 1.hash
=> -3708808305943022538
Using pry I can see that an integer responds to #hash
but I cannot see where it inherits the method from. It's not defined on Numeric
or Object
. If I knew what this method did, I would probably understand why it needs to be overridden at the same time as #eql?
.
So, why does #hash
need to be overridden whenever eql?
is overridden?
Firstly, what is the
#hash
method? I can see it returns an integer.
The #hash
method is supposed to return a hash of the receiver. (The name of the method is a bit of a giveaway).
Using pry I can see that an integer responds to
#hash
but I cannot see where it inherits the method from.
There are dozens of questions of the type "Where does this method come from" on [so], and the answer is always the same: the best way to know where a method comes from, is to simply ask it:
hash_method = 1.method(:hash)
hash_method.owner #=> Kernel
So, #hash
is inherited from Kernel
. Note however, that there is a bit of a peculiar relationship between Object
and Kernel
, in that some methods that are implemented in Kernel
are documented in Object
or vice versa. This probably has historic reasons, and is now an unfortunate fact of life in the Ruby community.
Unfortunately, for reasons I don't understand, the documentation for Object#hash
was deleted in 2017 in a commit ironically titled "Add documents". It is, however, still available in Ruby 2.4 (bold emphasis mine):
hash
→integer
Generates an Integer hash value for this object. This function must have the property that
a.eql?(b)
impliesa.hash == b.hash
.The hash value is used along with eql? by the Hash class to determine if two objects reference the same hash key. […]
So, as you can see, there is a deep and important relationship between #eql?
and #hash
, and in fact the correct behavior of methods that use #eql?
and #hash
depends on the fact that this relationship is maintained.
So, we know that the method is called #hash
and thus likely computes a hash. We know it is used together with eql?
, and we know that it is used in particular by the Hash
class.
What does it do, exactly? Well, we all know what a hash function is: it is a function that maps a larger, potentially infinite, input space into a smaller, finite, output space. In particular, in this case, the input space is the space of all Ruby objects, and the output space is the "fast integers" (i.e. the ones that used to be called Fixnum
).
And we know how a hash table works: values are placed in buckets based on the hash of their keys, if I want to find a value, then I only need to compute the hash of the key (which is fast) and know which bucket I find the value in (in constant time), as opposed to e.g. an array of key-value-pairs, where I need to compare the key against every key in the array (linear search) to find the value.
However, there is a problem: Since the output space of a hash is smaller than the input space, there are different objects which have the same hash value and thus end up in the same bucket. Thus, when two objects have different hash values, I know for a fact that they are different, but if they have the same hash value, then they could still be different, and I need to compare them for equality to be sure – and that's where the relationship between hash and equality comes from. Also note that when many keys and up in the same bucket, I will again have to compare the search key against every key in the bucket (linear search) to find the value.
From all this we can conclude the following properties of the #hash
method:
Integer
.Fixnum
s).Hash
may degenerate into a linked list with highly degraded performance.)Hash
to degenerate into a linked list as a form of Degradation-of-Service attack.)If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With