Inside the Rails code, people tend to use the Enumerable#inject
method to create hashes, like this:
some_enum.inject({}) do |hash, element|
  hash[element.foo] = element.bar
  hash
end
While this appears to have become a common idiom, does anyone see an advantage over the "naive" version, which would go like:
hash = {}
some_enum.each { |element| hash[element.foo] = element.bar }
The only advantage I see for the first version is that you do it in a closed block and you don't (explicitly) initialize the hash. Otherwise it abuses a method unexpectedly, is harder to understand and harder to read. So why is it so popular?
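To make the comparison concrete, here is a minimal, self-contained sketch of both versions side by side. The `Item` struct, its sample data, and the two helper method names are hypothetical stand-ins for whatever `some_enum` yields; only `foo` and `bar` come from the snippets above.

```ruby
# Hypothetical element type; the question only assumes #foo and #bar exist.
Item = Struct.new(:foo, :bar)
ITEMS = [Item.new(:a, 1), Item.new(:b, 2), Item.new(:c, 3)]

# inject version: the block must end with `hash`, because inject feeds the
# block's return value back in as the accumulator for the next element.
def build_with_inject(enum)
  enum.inject({}) do |hash, element|
    hash[element.foo] = element.bar
    hash
  end
end

# "Naive" version: initialize the hash explicitly and mutate it inside each.
def build_with_each(enum)
  hash = {}
  enum.each { |element| hash[element.foo] = element.bar }
  hash
end

p build_with_inject(ITEMS)
p build_with_each(ITEMS)
```

Both calls build the same hash. Dropping the trailing `hash` from the inject block is the usual way this idiom goes wrong: the assignment evaluates to `element.bar`, which then becomes the accumulator for the next iteration.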
As Aleksey points out, Hash#update() is slower than Hash#store(), but that got me thinking about the overall efficiency of #inject() vs a straight #each loop, so I benchmarked a few things:
require 'benchmark'

module HashInject
  extend self

  # 1000 [symbol, integer] pairs, e.g. [:s00000, 0]
  PAIRS = 1000.times.map { |i| [sprintf("s%05d", i).to_sym, i] }

  # inject yields (memo, element), so the pair has to be destructured with
  # parentheses; a plain |hash, sym, val| would leave val as nil.
  def inject_store
    PAIRS.inject({}) { |hash, (sym, val)| hash[sym] = val; hash }
  end

  # Hash#update returns the hash itself, so no trailing `hash` is needed.
  def inject_update
    PAIRS.inject({}) { |hash, (sym, val)| hash.update(sym => val) }
  end

  def each_store
    hash = {}
    PAIRS.each { |sym, val| hash[sym] = val }
    hash
  end

  def each_update
    hash = {}
    PAIRS.each { |sym, val| hash.update(sym => val) }
    hash
  end

  def each_with_object_store
    PAIRS.each_with_object({}) { |pair, hash| hash[pair[0]] = pair[1] }
  end

  def each_with_object_update
    PAIRS.each_with_object({}) { |pair, hash| hash.update(pair[0] => pair[1]) }
  end

  def by_initialization
    Hash[PAIRS]
  end

  def tap_store
    {}.tap { |hash| PAIRS.each { |sym, val| hash[sym] = val } }
  end

  def tap_update
    {}.tap { |hash| PAIRS.each { |sym, val| hash.update(sym => val) } }
  end

  # Number of times each method is run per report
  N = 10000

  Benchmark.bmbm do |x|
    x.report("inject_store") { N.times { inject_store } }
    x.report("inject_update") { N.times { inject_update } }
    x.report("each_store") { N.times { each_store } }
    x.report("each_update") { N.times { each_update } }
    x.report("each_with_object_store") { N.times { each_with_object_store } }
    x.report("each_with_object_update") { N.times { each_with_object_update } }
    x.report("by_initialization") { N.times { by_initialization } }
    x.report("tap_store") { N.times { tap_store } }
    x.report("tap_update") { N.times { tap_update } }
  end
end
And the results:
Rehearsal -----------------------------------------------------------
inject_store             10.510000   0.120000  10.630000 ( 10.659169)
inject_update             8.490000   0.190000   8.680000 (  8.696176)
each_store                4.290000   0.110000   4.400000 (  4.414936)
each_update              12.800000   0.340000  13.140000 ( 13.188187)
each_with_object_store    5.250000   0.110000   5.360000 (  5.369417)
each_with_object_update  13.770000   0.340000  14.110000 ( 14.166009)
by_initialization         3.040000   0.110000   3.150000 (  3.166201)
tap_store                 4.470000   0.110000   4.580000 (  4.594880)
tap_update               12.750000   0.340000  13.090000 ( 13.114379)
------------------------------------------------- total: 77.140000sec

                              user     system      total        real
inject_store             10.540000   0.110000  10.650000 ( 10.674739)
inject_update             8.620000   0.190000   8.810000 (  8.826045)
each_store                4.610000   0.110000   4.720000 (  4.732155)
each_update              12.630000   0.330000  12.960000 ( 13.016104)
each_with_object_store    5.220000   0.110000   5.330000 (  5.338678)
each_with_object_update  13.730000   0.340000  14.070000 ( 14.102297)
by_initialization         3.010000   0.100000   3.110000 (  3.123804)
tap_store                 4.430000   0.110000   4.540000 (  4.552919)
tap_update               12.850000   0.330000  13.180000 ( 13.217637)
=> true
Enumerable#each is faster than Enumerable#inject, and Hash#store is faster than Hash#update. But the fastest of all is to pass an array in at initialization time:
Hash[PAIRS]
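As a small illustration of what that call does, `Hash[]` accepts an array of two-element arrays and builds the hash in one step. The sample pairs and constant names below are made up for the example. `Array#to_h` does the same job, but it only exists from Ruby 2.1 on, hence the guard.

```ruby
# Made-up sample data: an array of [key, value] pairs.
SAMPLE_PAIRS = [[:s00000, 0], [:s00001, 1], [:s00002, 2]]

# Hash[] turns an array of two-element arrays into a hash in a single call.
FROM_BRACKETS = Hash[SAMPLE_PAIRS]

# Array#to_h is the equivalent on Ruby 2.1 and later; older interpreters
# do not define it, so fall back to Hash[] there.
FROM_TO_H = SAMPLE_PAIRS.respond_to?(:to_h) ? SAMPLE_PAIRS.to_h : Hash[SAMPLE_PAIRS]

p FROM_BRACKETS
p FROM_BRACKETS == FROM_TO_H
```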
If you're adding elements after the hash has been created, the winning version is exactly what the OP was suggesting:
hash = {}
PAIRS.each {|sym, val| hash[sym] = val }
hash
But in that case, if you're a purist who wants a single lexical form, you can use #tap and #each and get the same speed:
{}.tap {|hash| PAIRS.each {|sym, val| hash[sym] = val}}
For those not familiar with tap, it creates a binding of the receiver (the new hash) inside the body, and finally returns the receiver (the same hash). If you know Lisp, think of it as Ruby's version of LET binding.
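A minimal sketch of that behaviour follows. The `squares_by_number` helper and its input are hypothetical and not part of the benchmark; the point is only that the value of the block is discarded and the receiver comes back.

```ruby
# tap yields the receiver to the block, ignores the block's result,
# and returns the receiver itself.
def squares_by_number(numbers)
  {}.tap do |hash|
    numbers.each { |n| hash[n] = n * n }
    :ignored # the block's value is discarded; tap still returns the hash
  end
end

p squares_by_number([1, 2, 3])

# The object returned by tap is the very same object it was called on.
original = {}
p original.tap { |h| h[:k] = :v }.equal?(original)
```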
Since people have asked, here's the testing environment:
# Ruby version
ruby 2.0.0p247 (2013-06-27) [x86_64-darwin12.4.0]
# OS
Mac OS X 10.9.2
# Processor/RAM
2.6GHz Intel Core i7 / 8GB 1067 MHz DDR3