This is the opposite of Turning a Hash of Arrays into an Array of Hashes in Ruby.
Elegantly and/or efficiently turn an array of hashes into a hash where the values are arrays of all values:
    hs = [ { a:1, b:2 }, { a:3, c:4 }, { b:5, d:6 } ]
    collect_values( hs )
    #=> { :a=>[1,3], :b=>[2,5], :c=>[4], :d=>[6] }
This terse code almost works, but fails to create an array when there are no duplicates, because the block passed to merge is only called for keys present in both hashes:
    def collect_values( hashes )
      hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
    end
    collect_values( hs )
    #=> { :a=>[1,3], :b=>[2,5], :c=>4, :d=>6 }
This code works, but can you write a better version?
    def collect_values( hashes )
      # Requires Ruby 1.8.7+ for Object#tap
      Hash.new{ |h,k| h[k]=[] }.tap do |result|
        hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
      end
    end
Solutions that only work in Ruby 1.9 are acceptable, but should be noted as such.
Here are the results of benchmarking the various answers below (and a few more of my own), using three different arrays of hashes:
one where each hash has distinct keys, so no merging ever occurs:
    [{:a=>1}, {:b=>2}, {:c=>3}, {:d=>4}, {:e=>5}, {:f=>6}, {:g=>7}, ...]
one where every hash has the same key, so maximum merging occurs:
    [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}, {:a=>5}, {:a=>6}, {:a=>7}, ...]
one with a mix of unique and shared keys:
    [{:c=>1}, {:d=>1}, {:c=>2}, {:f=>1}, {:c=>1, :d=>1}, {:h=>1}, {:c=>3}, ...]
                   user     system      total        real
    Phrogz 2a  0.577000   0.000000   0.577000 (  0.576000)
    Phrogz 2b  0.624000   0.000000   0.624000 (  0.620000)
    Glenn 1    0.640000   0.000000   0.640000 (  0.641000)
    Phrogz 1   0.671000   0.000000   0.671000 (  0.668000)
    Michael 1  0.702000   0.000000   0.702000 (  0.700000)
    Michael 2  0.717000   0.000000   0.717000 (  0.726000)
    Glenn 2    0.765000   0.000000   0.765000 (  0.764000)
    fl00r      0.827000   0.000000   0.827000 (  0.836000)
    sawa       0.874000   0.000000   0.874000 (  0.868000)
    Tokland 1  0.873000   0.000000   0.873000 (  0.876000)
    Tokland 2  1.077000   0.000000   1.077000 (  1.073000)
    Phrogz 3   2.106000   0.093000   2.199000 (  2.209000)
The fastest code is this method that I added:
    def collect_values(hashes)
      {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
    end
I've accepted glenn mcdonald's answer. It was competitive in speed and reasonably terse, but most importantly it pointed out the danger of building the result with a Hash that has a self-modifying default proc: when the user later looks up a key that is missing, the lookup silently adds an entry to the hash.
Finally, here's the benchmark code, in case you want to run your own comparisons:
    require 'prime'   # To generate the third hash
    require 'facets'  # For tokland1's map_by

    AZSYMBOLS = (:a..:z).to_a
    TESTS = {
      '26 Distinct Hashes'   => AZSYMBOLS.zip(1..26).map{|a| Hash[*a] },
      '26 Same-Key Hashes'   => ([:a]*26).zip(1..26).map{|a| Hash[*a] },
      '26 Mixed-Keys Hashes' => (2..27).map do |i|
        factors = i.prime_division.transpose
        Hash[AZSYMBOLS.values_at(*factors.first).zip(factors.last)]
      end
    }

    def phrogz1(hashes)
      Hash.new{ |h,k| h[k]=[] }.tap do |result|
        hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
      end
    end

    def phrogz2a(hashes)
      {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
    end

    def phrogz2b(hashes)
      hashes.each_with_object({}){ |h,r| h.each{ |k,v| (r[k]||=[]) << v } }
    end

    def phrogz3(hashes)
      result = hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
      result.each{ |k,v| result[k] = [v] unless v.is_a? Array }
    end

    def glenn1(hs)
      hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
    end

    def glenn2(hs)
      hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
    end

    def fl00r(hs)
      h = Hash.new{|h,k| h[k]=[]}
      hs.map(&:to_a).flatten(1).each{|v| h[v[0]] << v[1]}
      h
    end

    def sawa(a)
      a.map(&:to_a).flatten(1).group_by{|k,v| k}.each_value{|v| v.map!{|k,v| v}}
    end

    def michael1(hashes)
      h = Hash.new{|h,k| h[k]=[]}
      hashes.each_with_object(h) do |h, result|
        h.each{ |k, v| result[k] << v }
      end
    end

    def michael2(hashes)
      h = Hash.new{|h,k| h[k]=[]}
      hashes.inject(h) do |result, h|
        h.each{ |k, v| result[k] << v }
        result
      end
    end

    def tokland1(hs)
      hs.map(&:to_a).flatten(1).map_by{ |k, v| [k, v] }
    end

    def tokland2(hs)
      Hash[hs.map(&:to_a).flatten(1).group_by(&:first).map{ |k, vs|
        [k, vs.map{|o|o[1]}]
      }]
    end

    require 'benchmark'
    N = 10_000
    Benchmark.bm do |x|
      x.report('Phrogz 2a'){ TESTS.each{ |n,h| N.times{ phrogz2a(h) } } }
      x.report('Phrogz 2b'){ TESTS.each{ |n,h| N.times{ phrogz2b(h) } } }
      x.report('Glenn 1  '){ TESTS.each{ |n,h| N.times{ glenn1(h)   } } }
      x.report('Phrogz 1 '){ TESTS.each{ |n,h| N.times{ phrogz1(h)  } } }
      x.report('Michael 1'){ TESTS.each{ |n,h| N.times{ michael1(h) } } }
      x.report('Michael 2'){ TESTS.each{ |n,h| N.times{ michael2(h) } } }
      x.report('Glenn 2  '){ TESTS.each{ |n,h| N.times{ glenn2(h)   } } }
      x.report('fl00r    '){ TESTS.each{ |n,h| N.times{ fl00r(h)    } } }
      x.report('sawa     '){ TESTS.each{ |n,h| N.times{ sawa(h)     } } }
      x.report('Tokland 1'){ TESTS.each{ |n,h| N.times{ tokland1(h) } } }
      x.report('Tokland 2'){ TESTS.each{ |n,h| N.times{ tokland2(h) } } }
      x.report('Phrogz 3 '){ TESTS.each{ |n,h| N.times{ phrogz3(h)  } } }
    end
Hash#merge! is an instance method of Hash that adds the contents of the given hash to the receiver, modifying it in place. If no block is given, entries with duplicate keys are overwritten with the values from other_hash; if a block is given, it is called for each duplicate key and its return value is stored.
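A minimal sketch of both behaviours, with and without a block. The variable names are made up for illustration.

```ruby
# Without a block, the incoming value wins for duplicate keys.
totals = { a: 1, b: 2 }
totals.merge!({ a: 3, c: 4 })
# totals is now { a: 3, b: 2, c: 4 }

# With a block, the block decides what to store for each duplicate key.
# It is not called for keys that appear in only one of the hashes.
summed = { a: 1, b: 2 }
summed.merge!({ a: 3, c: 4 }) { |_key, old, new| old + new }
# summed is now { a: 4, b: 2, c: 4 }
```

The second case is why the inject/merge attempt in the question leaves lone values unwrapped: the block never sees a key that has no duplicate.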
Array#to_h converts an array of nested two-element arrays into a hash, treating each nested array as a key-value pair. The method also accepts a block, which must return the key-value pair to store for each element.
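A short sketch of both forms. The block form assumes Ruby 2.6 or later; the variable names are illustrative only.

```ruby
pairs = [[:a, 1], [:b, 2]]

# Each nested [key, value] array becomes one entry.
plain = pairs.to_h
# plain is { a: 1, b: 2 }

# With a block (Ruby 2.6+), the block returns the [key, value] pair to store.
scaled = pairs.to_h { |k, v| [k.to_s, v * 10] }
# scaled is { "a" => 10, "b" => 20 }
```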
With arrays, the key is an integer, whereas hashes support any object as a key. Both arrays and hashes grow as needed to hold new elements. It's more efficient to access array elements, but hashes provide more flexibility.
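As a small illustration of that difference (names are made up for the example):

```ruby
# Arrays are indexed by integer position and grow when you assign past the end.
list = []
list[3] = :x
# list is now [nil, nil, nil, :x]

# Hashes accept any object as a key and grow as entries are added.
lookup = {}
lookup["name"] = 1
lookup[[1, 2]] = 2
lookup[:sym]   = 3
# lookup now has three entries, keyed by a String, an Array and a Symbol
```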
An array of hashes is useful when you have a bunch of records that you'd like to access sequentially, and each record itself contains key/value pairs. Arrays of hashes are used less frequently than the other structures in this chapter.
Take your pick:
    hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}

    hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
I'm strongly against messing with the defaults for hashes, as the other suggestions do, because then checking for a value modifies the hash, which seems very wrong to me.
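To make the concern concrete, here is a small sketch comparing the two construction styles. The variable names are illustrative, not taken from any of the answers.

```ruby
# Built with a self-modifying default proc.
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1

# Merely reading a missing key inserts it.
with_default[:missing]
default_keys = with_default.keys
# default_keys is [:a, :missing]

# Built with ||= on a plain hash: reading a missing key returns nil
# and leaves the hash unchanged.
plain = {}
(plain[:a] ||= []) << 1
plain[:missing]
plain_keys = plain.keys
# plain_keys is [:a]
```

If the default-proc style is still preferred while building, one option is to clear it before returning the result with `result.default_proc = nil` (Ruby 2.0+), so later lookups behave like those on a plain hash.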