How to merge array of hashes to get hash of arrays of values

Tags: arrays, merge, ruby, hash

This is the opposite of Turning a Hash of Arrays into an Array of Hashes in Ruby.

Elegantly and/or efficiently turn an array of hashes into a hash where the values are arrays of all values:

hs = [
  { a:1, b:2 },
  { a:3, c:4 },
  { b:5, d:6 }
]
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>[4], :d=>[6] }

This terse code almost works, but fails to create an array when there are no duplicates:

def collect_values( hashes )
  hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
end
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>4, :d=>6 }
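The failure comes from how Hash#merge treats its block: the block runs only for keys present in both hashes, so a key that occurs once is copied over as a bare value. A minimal sketch of that behavior (the names `calls` and `merged` are illustrative only):

```ruby
# Hash#merge yields to its block only when a key exists in BOTH hashes.
calls = []
merged = { a: 1, b: 2 }.merge({ a: 3, c: 4 }) do |key, old, incoming|
  calls << key        # record which keys actually reach the block
  [*old, *incoming]   # same wrap-and-concatenate step as the terse version
end

p calls   # only the duplicated key :a is recorded
p merged  # :a holds an array; :b and :c keep their bare values
```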

This code works, but can you write a better version?

def collect_values( hashes )
  # Requires Ruby 1.8.7+ for Object#tap
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end

Solutions that only work in Ruby 1.9 are acceptable, but should be noted as such.

Here are the results of benchmarking the various answers below (and a few more of my own), using three different arrays of hashes:

one where each hash has distinct keys, so no merging ever occurs:
[{:a=>1}, {:b=>2}, {:c=>3}, {:d=>4}, {:e=>5}, {:f=>6}, {:g=>7}, ...]
one where every hash has the same key, so maximum merging occurs:
[{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}, {:a=>5}, {:a=>6}, {:a=>7}, ...]
and one that is a mix of unique and shared keys:
[{:c=>1}, {:d=>1}, {:c=>2}, {:f=>1}, {:c=>1, :d=>1}, {:h=>1}, {:c=>3}, ...]

                user     system      total        real
Phrogz 2a   0.577000   0.000000   0.577000 (  0.576000)
Phrogz 2b   0.624000   0.000000   0.624000 (  0.620000)
Glenn 1     0.640000   0.000000   0.640000 (  0.641000)
Phrogz 1    0.671000   0.000000   0.671000 (  0.668000)
Michael 1   0.702000   0.000000   0.702000 (  0.700000)
Michael 2   0.717000   0.000000   0.717000 (  0.726000)
Glenn 2     0.765000   0.000000   0.765000 (  0.764000)
fl00r       0.827000   0.000000   0.827000 (  0.836000)
sawa        0.874000   0.000000   0.874000 (  0.868000)
Tokland 1   0.873000   0.000000   0.873000 (  0.876000)
Tokland 2   1.077000   0.000000   1.077000 (  1.073000)
Phrogz 3    2.106000   0.093000   2.199000 (  2.209000)

The fastest code is this method that I added:

def collect_values(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end

I've accepted glenn mcdonald's answer because it was competitive in speed, reasonably terse, and (most importantly) pointed out the danger of building the result with a Hash whose default proc modifies the hash itself: the proc stays attached to the returned hash, so a later lookup of a missing key silently adds that key.
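One way to keep the convenience during construction without leaving the side effect behind is to clear the default proc before returning the hash. This is only a sketch, not something proposed in the answers: the method name `collect_values_detached` is made up for illustration, and it assumes Ruby 2.0 or later, where `Hash#default_proc=` accepts `nil`.

```ruby
# Hypothetical variant: build with a default proc, then detach it.
# Assumes Ruby 2.0+, where Hash#default_proc= accepts nil.
def collect_values_detached(hashes)
  result = Hash.new { |h, k| h[k] = [] }
  hashes.each { |h| h.each { |k, v| result[k] << v } }
  result.default_proc = nil   # later lookups of missing keys no longer add them
  result
end

collected = collect_values_detached([{ a: 1, b: 2 }, { a: 3 }])
collected[:missing]           # plain lookup; returns nil
p collected                   # only :a and :b are present
```

The trade-off is one extra statement at the end of the method in exchange for a result that behaves like an ordinary hash.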

Finally, here's the benchmark code, in case you want to run your own comparisons:

require 'prime'   # To generate the third hash
require 'facets'  # For tokland1's map_by
AZSYMBOLS = (:a..:z).to_a
TESTS = {
  '26 Distinct Hashes'   => AZSYMBOLS.zip(1..26).map{|a| Hash[*a] },
  '26 Same-Key Hashes'   => ([:a]*26).zip(1..26).map{|a| Hash[*a] },
  '26 Mixed-Keys Hashes' => (2..27).map do |i|
    factors = i.prime_division.transpose
    Hash[AZSYMBOLS.values_at(*factors.first).zip(factors.last)]
  end
}

def phrogz1(hashes)
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
def phrogz2a(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
def phrogz2b(hashes)
  hashes.each_with_object({}){ |h,r| h.each{ |k,v| (r[k]||=[]) << v } }
end
def phrogz3(hashes)
  result = hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
  result.each{ |k,v| result[k] = [v] unless v.is_a? Array }
end
def glenn1(hs)
  hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
end
def glenn2(hs)
  hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
end
def fl00r(hs)
  h = Hash.new{|h,k| h[k]=[]}
  hs.map(&:to_a).flatten(1).each{|v| h[v[0]] << v[1]}
  h
end
def sawa(a)
  a.map(&:to_a).flatten(1).group_by{|k,v| k}.each_value{|v| v.map!{|k,v| v}}
end
def michael1(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.each_with_object(h) do |h, result|
    h.each{ |k, v| result[k] << v }
  end
end
def michael2(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.inject(h) do |result, h|
    h.each{ |k, v| result[k] << v }
    result
  end
end
def tokland1(hs)
  hs.map(&:to_a).flatten(1).map_by{ |k, v| [k, v] }
end
def tokland2(hs)
  Hash[hs.map(&:to_a).flatten(1).group_by(&:first).map{ |k, vs|
    [k, vs.map{|o|o[1]}]
  }]
end

require 'benchmark'
N = 10_000
Benchmark.bm do |x|
  x.report('Phrogz 2a'){ TESTS.each{ |n,h| N.times{ phrogz2a(h) } } }
  x.report('Phrogz 2b'){ TESTS.each{ |n,h| N.times{ phrogz2b(h) } } }
  x.report('Glenn 1  '){ TESTS.each{ |n,h| N.times{ glenn1(h)   } } }
  x.report('Phrogz 1 '){ TESTS.each{ |n,h| N.times{ phrogz1(h)  } } }
  x.report('Michael 1'){ TESTS.each{ |n,h| N.times{ michael1(h) } } }
  x.report('Michael 2'){ TESTS.each{ |n,h| N.times{ michael2(h) } } }
  x.report('Glenn 2  '){ TESTS.each{ |n,h| N.times{ glenn2(h)   } } }
  x.report('fl00r    '){ TESTS.each{ |n,h| N.times{ fl00r(h)    } } }
  x.report('sawa     '){ TESTS.each{ |n,h| N.times{ sawa(h)     } } }
  x.report('Tokland 1'){ TESTS.each{ |n,h| N.times{ tokland1(h) } } }
  x.report('Tokland 2'){ TESTS.each{ |n,h| N.times{ tokland2(h) } } }
  x.report('Phrogz 3 '){ TESTS.each{ |n,h| N.times{ phrogz3(h)  } } }
end
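Before timing candidates it can help to confirm that they return the same result. The following is a small sketch of such a check, not part of the original benchmark; it repeats two of the methods so that it runs on its own, and the names `sample`, `expected`, and `results` are illustrative.

```ruby
# Two of the candidates from the benchmark, repeated so this snippet is self-contained.
def phrogz2a(hashes)
  {}.tap { |r| hashes.each { |h| h.each { |k, v| (r[k] ||= []) << v } } }
end

def glenn1(hs)
  hs.reduce({}) { |h, pairs| pairs.each { |k, v| (h[k] ||= []) << v }; h }
end

sample   = [{ a: 1, b: 2 }, { a: 3, c: 4 }, { b: 5, d: 6 }]
expected = { a: [1, 3], b: [2, 5], c: [4], d: [6] }

# Call each candidate by name and compare its result with the expected hash.
results = [:phrogz2a, :glenn1].map { |name| [name, send(name, sample)] }
results.each do |name, result|
  puts "#{name}: #{result == expected ? 'ok' : 'MISMATCH'}"
end
```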

731

asked Mar 30 '11 19:03

Phrogz

1 Answer

Take your pick:

hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}

hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}

I'm strongly against messing with the defaults for hashes, as the other suggestions do, because then checking for a value modifies the hash, which seems very wrong to me.
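To make the concern concrete, here is a minimal sketch contrasting the two styles (the variable names `with_default` and `plain` are illustrative):

```ruby
# A hash whose default proc assigns into the hash: a lookup has a side effect.
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1
with_default[:b]                 # merely checking for :b ...
p with_default.key?(:b)          # ... has added it with an empty array

# A plain hash built with ||= : lookups leave it untouched.
plain = {}
(plain[:a] ||= []) << 1
plain[:b]                        # returns nil, nothing is stored
p plain.key?(:b)
```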

108

answered Oct 10 '22 16:10

glenn mcdonald
