When iterating through a hash like below:
hash.keys.each do |key|
  process_key(key)
end
Rubocop proclaims I should use each_key instead of keys.each.
What is the "key" difference between keys.each and each_key?
In Ruby, a Hash is a collection of unique keys and their values. A Hash is like an Array, except that indexing is done with arbitrary keys of any object type. Since Ruby 1.9, a Hash enumerates its keys and values in the order in which the keys were inserted.
You can check whether a hash contains a particular key with has_key?(key), or its shorter alias key?(key). It returns true or false depending on whether the key exists in the hash.
In Ruby, a hash is a collection of key-value pairs. A hash is denoted by a set of curly braces ( {} ) which contains key-value pairs separated by commas. Each value is assigned to a key using a hash rocket ( => ). Calling the hash followed by a key name within brackets grabs the value associated with that key.
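As a small illustration of the hash basics described above, here is a sketch using a made-up inventory hash. The method name build_inventory and the data are hypothetical and only serve the example.

```ruby
# Hypothetical sample data: string keys assigned to values with hash rockets.
def build_inventory
  { "apples" => 3, "pears" => 5 }
end

inventory = build_inventory

# Brackets look up the value stored under a key.
puts inventory["apples"]          # 3

# key? and its alias has_key? report whether a key exists.
puts inventory.key?("pears")      # true
puts inventory.has_key?("kiwis")  # false

# Keys come back in insertion order.
puts inventory.keys.inspect       # ["apples", "pears"]
```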
Rubocop recommends each_key here for performance reasons. The difference becomes noticeable with large sets of data. Here's the doc on it: https://github.com/bbatsov/rubocop/blob/master/manual/cops_performance.md#performancehasheachmethods
I also found a benchmark someone wrote up to test this: https://gist.github.com/jodosha/8ca2bee6137be94e9dcb
I modified it a bit and ran it on one of my systems:
Warming up --------------------------------------
string each 128.742k i/100ms
string keys 114.523k i/100ms
string each_key 134.279k i/100ms
symbol each 128.838k i/100ms
symbol keys 109.398k i/100ms
symbol each_key 132.021k i/100ms
Calculating -------------------------------------
string each 2.053M (± 4.0%) i/s - 10.299M in 5.026890s
string keys 1.864M (± 1.4%) i/s - 9.391M in 5.039759s
string each_key 2.224M (± 5.5%) i/s - 11.145M in 5.032201s
symbol each 2.082M (± 1.0%) i/s - 10.436M in 5.013145s
symbol keys 1.815M (± 2.1%) i/s - 9.080M in 5.004690s
symbol each_key 2.240M (± 1.9%) i/s - 11.222M in 5.012184s
Comparison:
symbol each_key: 2239720.0 i/s
string each_key: 2224205.1 i/s - same-ish: difference falls within error
symbol each: 2081895.2 i/s - 1.08x slower
string each: 2052884.9 i/s - 1.09x slower
string keys: 1863740.5 i/s - 1.20x slower
symbol keys: 1815131.1 i/s - 1.23x slower
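The modified benchmark script itself is not shown here. As a rough stand-in, the sketch below times the two approaches with Ruby's standard-library Benchmark module rather than whatever tool produced the output above. The helper names, the hash size, and the iteration count are all assumptions chosen for illustration, and the timings it prints will differ from machine to machine.

```ruby
require "benchmark"

# Build a hypothetical hash with string keys; the size is an arbitrary choice.
def build_sample_hash(size)
  (1..size).to_h { |i| ["key#{i}", i] }
end

# Time `iterations` passes over the keys via an intermediate Array.
def time_keys_each(hash, iterations)
  Benchmark.realtime do
    iterations.times { hash.keys.each { |key| key } }
  end
end

# Time `iterations` passes where the hash yields its keys directly.
def time_each_key(hash, iterations)
  Benchmark.realtime do
    iterations.times { hash.each_key { |key| key } }
  end
end

hash = build_sample_hash(1_000)
puts format("keys.each: %.4fs", time_keys_each(hash, 1_000))
puts format("each_key:  %.4fs", time_each_key(hash, 1_000))
```

A wall-clock measurement like this is much cruder than an iterations-per-second benchmark with warm-up, so treat it only as a quick sanity check.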
Chaining methods is going to be slower than using the built-in method (in this case), which accomplishes the task with a single special enumerator. The language creators put it there for a reason, and it's also idiomatic.
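To make the cost of chaining concrete, here is a minimal sketch showing that keys hands back a freshly built Array on every call, while each_key visits the same keys without that intermediate step. The method names and sample data are hypothetical.

```ruby
# Collect keys by first materializing them into an Array, then iterating it.
def visit_with_keys_each(hash)
  visited = []
  hash.keys.each { |key| visited << key }
  visited
end

# Collect keys by letting the hash yield them directly.
def visit_with_each_key(hash)
  visited = []
  hash.each_key { |key| visited << key }
  visited
end

sample = { a: 1, b: 2, c: 3 } # hypothetical data

# `keys` returns a different Array object on every call.
puts sample.keys.equal?(sample.keys)                              # false

# Both approaches visit the same keys in the same order.
puts visit_with_keys_each(sample) == visit_with_each_key(sample)  # true
```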
Rubocop is wrong. Which one you should use depends on what you want as the return value.
With keys.each, keys creates a new array of keys, and each returns that array of keys after performing the block on each key.
With each_key, the keys are passed to the block directly and the hash itself is returned. If you do not need the array of keys as the return value, use each_key, since that does not create an array that would not be used, and would be more efficient.
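The return-value difference can be checked directly. The sketch below wraps each form in a small helper (the helper names and sample hash are made up for the example) and inspects what comes back.

```ruby
# Returns whatever `hash.keys.each` evaluates to.
def return_value_of_keys_each(hash)
  hash.keys.each { |key| key }
end

# Returns whatever `hash.each_key` evaluates to.
def return_value_of_each_key(hash)
  hash.each_key { |key| key }
end

sample = { a: 1, b: 2 } # hypothetical data

# keys.each gives back the array of keys.
p return_value_of_keys_each(sample)                  # [:a, :b]

# each_key gives back the very same hash object it was called on.
puts return_value_of_each_key(sample).equal?(sample) # true
```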