I found that creating a Hash with seven entries is much slower than creating one with six. I know that the size of a hash affects performance, but I don't know why seven in particular is special.
Here is the benchmark code (Ruby 2.2.3):
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report(5) { { a: 0, b: 1, c: 2, d: 3, e: 4 } }
  x.report(6) { { a: 0, b: 1, c: 2, d: 3, e: 4, f: 5 } }
  x.report(7) { { a: 0, b: 1, c: 2, d: 3, e: 4, f: 5, g: 6 } }
  x.report(8) { { a: 0, b: 1, c: 2, d: 3, e: 4, f: 5, g: 6, h: 7 } }
  x.report(9) { { a: 0, b: 1, c: 2, d: 3, e: 4, f: 5, g: 6, h: 7, i: 8 } }
  x.compare!
end
And below are the results:
Calculating -------------------------------------
5 65.986k i/100ms
6 63.966k i/100ms
7 30.713k i/100ms
8 28.991k i/100ms
9 27.115k i/100ms
-------------------------------------------------
5 1.243M (± 4.3%) i/s - 6.203M
6 1.202M (± 5.3%) i/s - 6.013M
7 373.366k (±13.7%) i/s - 1.843M
8 351.945k (± 8.8%) i/s - 1.768M
9 331.398k (± 8.2%) i/s - 1.654M
Comparison:
5: 1243005.5 i/s
6: 1202032.4 i/s - 1.03x slower
7: 373366.5 i/s - 3.33x slower
8: 351945.1 i/s - 3.53x slower
9: 331398.3 i/s - 3.75x slower
It's just that the constant overhead of the hash table itself is quite large.
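One way to make that constant cost visible is to compare the reported memory size of a 6-entry and a 7-entry hash. As far as I know, MRI before 2.4 keeps hashes with up to six entries in a compact "packed" form and switches to the full bucket table on the seventh entry, which is why 7 is the threshold in the benchmark above. The snippet below is only a rough illustration; the exact numbers depend on the Ruby version and platform.

require 'objspace'

# Six entries: still small enough for the compact representation.
six   = { a: 0, b: 1, c: 2, d: 3, e: 4, f: 5 }
# Seven entries: expected to carry the full bucket table.
seven = { a: 0, b: 1, c: 2, d: 3, e: 4, f: 5, g: 6 }

# memsize_of reports the memory MRI accounts to each object; the
# seven-entry hash should report a noticeably larger footprint.
puts ObjectSpace.memsize_of(six)
puts ObjectSpace.memsize_of(seven)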
From "Hash lookup in Ruby, why is it so fast?":
Ruby manages the number of bins dynamically. It starts with 11, and as soon as one of the bins holds 5 or more elements, the number of bins is increased and all hash elements are reallocated to their new corresponding bins.
At some point you pay an increased time penalty while Ruby resizes the bin pool, but if you think about it, it's worth the time, since this keeps lookup times and memory usage as low as possible.
This means that the more bins there are, the fewer entries each bin holds, so less time is spent looking for a specific key within a bin.
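To make the quoted description more concrete, here is a deliberately simplified Ruby model of a bucket-based hash table that grows its bin array when any bin gets too full. It is only a sketch of the general technique; MRI's real st_table is written in C and differs in many details (growth policy, hashing, the packed small-hash case, and so on).

class ToyHash
  INITIAL_BINS  = 11  # the quote mentions Ruby starting with 11 bins
  MAX_BIN_DEPTH = 5   # grow once any single bin gets this full

  def initialize
    @bins = Array.new(INITIAL_BINS) { [] }
  end

  def []=(key, value)
    bin  = @bins[key.hash % @bins.length]
    pair = bin.find { |k, _| k == key }
    if pair
      pair[1] = value          # overwrite an existing key
    else
      bin << [key, value]      # append to this bin's chain
      rehash! if bin.length >= MAX_BIN_DEPTH
    end
    value
  end

  def [](key)
    bin  = @bins[key.hash % @bins.length]
    pair = bin.find { |k, _| k == key }
    pair && pair[1]
  end

  private

  # Growing the number of bins and redistributing every entry is the
  # occasional time penalty mentioned above; it keeps each bin short so
  # lookups stay close to constant time.
  def rehash!
    old_entries = @bins.flatten(1)
    @bins = Array.new(@bins.length * 2 + 1) { [] }
    old_entries.each { |k, v| @bins[k.hash % @bins.length] << [k, v] }
  end
end

h = ToyHash.new
('a'..'z').each_with_index { |ch, i| h[ch.to_sym] = i }  # triggers a few rehashes
h[:q]  # => 16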
Hope this helps to explain the behaviour.