I just noticed that the return value of #hash changes each time I start up Ruby:
$ irb
2.0.0-p353 :001 > "".hash
2313425349783613115
2.0.0-p353 :002 > exit
$ irb
2.0.0-p353 :001 > "".hash
4543564897974813688
2.0.0-p353 :002 > exit
I looked at the MRI source to see why this was happening:
st_index_t
rb_str_hash(VALUE str)
{
    int e = ENCODING_GET(str);
    if (e && rb_enc_str_coderange(str) == ENC_CODERANGE_7BIT) {
        e = 0;
    }
    return rb_memhash((const void *)RSTRING_PTR(str), RSTRING_LEN(str)) ^ e;
}
It turns out rb_memhash is defined in random.c:
st_index_t
rb_memhash(const void *ptr, long len)
{
    sip_uint64_t h = sip_hash24(sipseed.key, ptr, len);
#ifdef HAVE_UINT64_T
    return (st_index_t)h;
#else
    return (st_index_t)(h.u32[0] ^ h.u32[1]);
#endif
}
And though I can't find what ruby_sip_hash24 is, I assume that it's not a deterministic function.
After a bit of messing around, I managed to find this commit by Tanaka Akira that changes rb_str_hash to use rb_memhash, with the stated reason "avoid algorithmic complexity attacks". What does that mean?
Thanks!
As the commit message says, the change is meant to avoid algorithmic complexity attacks.
An algorithmic complexity attack is a form of computer attack that exploits known cases in which an algorithm used in a piece of software will exhibit worst case behavior. This type of attack can be used to achieve a denial-of-service.
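To make the worst case concrete, here is a minimal sketch of what happens to a hash table when every key collides. CollidingKey and comparisons_for are hypothetical names invented for this illustration; the constant #hash stands in for attacker-chosen inputs that all land in the same bucket.

```ruby
# Hypothetical key class: every instance reports the same hash value,
# imitating inputs an attacker has crafted to collide.
class CollidingKey
  attr_reader :id

  @comparisons = 0
  class << self
    attr_accessor :comparisons
  end

  def initialize(id)
    @id = id
  end

  # Constant hash: all keys collide.
  def hash
    42
  end

  # Hash falls back to eql? to tell colliding keys apart; count each call.
  def eql?(other)
    CollidingKey.comparisons += 1
    other.is_a?(CollidingKey) && id == other.id
  end
end

# Insert n distinct colliding keys and report how many eql? calls it took.
def comparisons_for(n)
  CollidingKey.comparisons = 0
  table = {}
  n.times { |i| table[CollidingKey.new(i)] = i }
  CollidingKey.comparisons
end

small = comparisons_for(100)
large = comparisons_for(200)
puts "100 keys: #{small} comparisons"
puts "200 keys: #{large} comparisons"
```

Doubling the number of keys should roughly quadruple the comparison count, because each insert has to be checked against the keys already in the table. That quadratic growth is what an attacker exploits to tie up a server with a comparatively small request.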
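By using rb_memhash, the hash result is randomized every time you start a new Ruby process: the SipHash function itself is deterministic, but it is keyed with a seed (sipseed.key in the code above) that is generated randomly at startup. If the result were not randomized, an attacker who knows the algorithm could work out inputs that trigger the worst-case behavior and use them for a DoS attack.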
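You can observe the per-process randomization from Ruby itself. The following is a small sketch, assuming the current interpreter can spawn child processes; empty_string_hash_in_new_process is a helper name made up for this example.

```ruby
require "rbconfig"

# Start a fresh Ruby interpreter and return the value it reports for "".hash.
def empty_string_hash_in_new_process
  IO.popen([RbConfig.ruby, "-e", 'print "".hash'], &:read).to_i
end

first  = empty_string_hash_in_new_process
second = empty_string_hash_in_new_process

# Within a single process the value is stable for the lifetime of the process.
puts "stable within this process: #{''.hash == ''.hash}"

# Separate processes are expected to report different values.
puts "process 1: #{first}"
puts "process 2: #{second}"
puts "differ across processes: #{first != second}"
```

Because the value is only stable within one process, it should not be persisted or shared between processes; for that, a digest such as those in Ruby's Digest library is the usual choice.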