
How do I create the intersection of two hashes?

Tags: ruby

I have two hashes:

hash1 = {1 => "a" , 2 => "b" , 3 => "c" , 4 => "d"} 
hash2 = {3 => "hello", 4 => "world" , 5 => "welcome"} 

I need a hash that contains only the keys common to both hashes, with the values taken from hash2:

hash3 = {3 => "hello" , 4 => "world"}

Is it possible to do it without any loop?

user2575339 asked Jul 12 '13 06:07


4 Answers

hash3 = hash1.keep_if { |k, v| hash2.key? k }

This doesn't produce the result asked for in the question; instead it returns:

hash3 #=> { 3 => "c", 4 => "d" }

The order of the hashes is important here. The values will always be taken from the hash that #keep_if is sent to. Note that #keep_if also modifies that hash in place.

hash3 = hash2.keep_if { |k, v| hash1.key? k }
#=> {3 => "hello", 4 => "world"}
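
One caveat worth checking: Hash#keep_if modifies its receiver, so the snippets above also change hash1 or hash2 themselves. A minimal sketch, using the hashes from the question, that contrasts it with the non-destructive Hash#select (the names hash4 and size_after_select are only for illustration):

```ruby
hash1 = {1 => "a", 2 => "b", 3 => "c", 4 => "d"}
hash2 = {3 => "hello", 4 => "world", 5 => "welcome"}

# select builds a new hash and leaves hash2 untouched
hash3 = hash2.select { |k, _v| hash1.key?(k) }
size_after_select = hash2.size

# keep_if deletes the non-matching pairs from hash2 itself and returns hash2
hash4 = hash2.keep_if { |k, _v| hash1.key?(k) }

p(hash3 == {3 => "hello", 4 => "world"})  # true
p size_after_select                       # 3
p hash4.equal?(hash2)                     # true
p hash2.size                              # 2
```

If the original hashes are needed afterwards, select (or keep_if on a dup) is probably the safer choice.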
Koraktor answered Nov 14 '22 06:11


I'd go with this:

hash1 = {1 => "a" , 2 => "b" , 3 => "c" , 4 => "d"} 
hash2 = {3 => "hello", 4 => "world" , 5 => "welcome"} 

Hash[(hash1.keys & hash2.keys).zip(hash2.values_at(*(hash1.keys & hash2.keys)))]
=> {3=>"hello", 4=>"world"}

Which can be reduced a bit to:

keys = (hash1.keys & hash2.keys)
Hash[keys.zip(hash2.values_at(*keys))]

The trick is in Array's & method. The documentation says:

Set Intersection — Returns a new array containing elements common to the two arrays, excluding any duplicates. The order is preserved from the original array.
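
To make the steps concrete, here is a small sketch with the hashes from the question. Array#& gives the shared keys, and values_at pulls the matching values from hash2. The to_h call at the end is an alternative to Hash[...] and assumes Ruby 2.1 or newer; the variable names values and same are only for illustration.

```ruby
hash1 = {1 => "a", 2 => "b", 3 => "c", 4 => "d"}
hash2 = {3 => "hello", 4 => "world", 5 => "welcome"}

# Keys present in both hashes, in hash1's key order
keys = hash1.keys & hash2.keys

# Values from hash2 for those keys, in the same order
values = hash2.values_at(*keys)

# Pair keys with values and turn the pairs back into a hash
hash3 = Hash[keys.zip(values)]
same  = keys.zip(values).to_h

p keys             # [3, 4]
p values           # ["hello", "world"]
p(hash3 == same)   # true
```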


Here are some benchmarks to show which approach is the most efficient:

require 'benchmark'

HASH1 = {1 => "a" , 2 => "b" , 3 => "c" , 4 => "d"} 
HASH2 = {3 => "hello", 4 => "world" , 5 => "welcome"} 

def tinman
  keys = (HASH1.keys & HASH2.keys)
  Hash[keys.zip(HASH2.values_at(*keys))]
end

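def santhosh
  HASH2.select {|key, value| HASH1.has_key? key }
end

def santhosh_2
  # tests the value stored in HASH1 instead of calling has_key?
  HASH2.select {|key, value| HASH1[key] }
end

def priti
  HASH2.select{|k,v| HASH1.assoc(k) }
end

def koraktor
  # keep_if modifies HASH1 in place; the values come from HASH1
  HASH1.keep_if { |k, v| HASH2.key? k }
end

def koraktor2
  # keep_if modifies HASH2 in place; the values come from HASH2
  HASH2.keep_if { |k, v| HASH1.key? k }
end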

N = 1_000_000
puts RUBY_VERSION
puts "N= #{N}"

puts [:tinman, :santhosh, :santhosh_2, :priti, :koraktor, :koraktor2].map{ |s| "#{s.to_s} = #{send(s)}" }
Benchmark.bm(11) do |x|
  x.report('tinman') { N.times { tinman() }}
  x.report('santhosh_2') { N.times { santhosh_2() }}
  x.report('santhosh') { N.times { santhosh() }}
  x.report('priti') { N.times { priti() }}
  x.report('koraktor') { N.times { koraktor() }}
  x.report('koraktor2') { N.times { koraktor2() }}
end

Running under Ruby 1.9.3-p448:

1.9.3
N= 1000000
tinman = {3=>"hello", 4=>"world"}
santhosh = {3=>"hello", 4=>"world"}
santhosh_2 = {3=>"hello", 4=>"world"}
priti = {3=>"hello", 4=>"world"}
koraktor = {3=>"c", 4=>"d"}
koraktor2 = {3=>"hello", 4=>"world"}
                  user     system      total        real
tinman        2.430000   0.000000   2.430000 (  2.430030)
santhosh_2    1.000000   0.020000   1.020000 (  1.003635)
santhosh      1.090000   0.010000   1.100000 (  1.104067)
priti         1.350000   0.000000   1.350000 (  1.352476)
koraktor      0.490000   0.000000   0.490000 (  0.484686)
koraktor2     0.480000   0.000000   0.480000 (  0.483327)

Running under Ruby 2.0.0-p247:

2.0.0
N= 1000000
tinman = {3=>"hello", 4=>"world"}
santhosh = {3=>"hello", 4=>"world"}
santhosh_2 = {3=>"hello", 4=>"world"}
priti = {3=>"hello", 4=>"world"}
koraktor = {3=>"c", 4=>"d"}
koraktor2 = {3=>"hello", 4=>"world"}
                  user     system      total        real
tinman        1.890000   0.000000   1.890000 (  1.882352)
santhosh_2    0.710000   0.010000   0.720000 (  0.735830)
santhosh      0.790000   0.020000   0.810000 (  0.807413)
priti         1.030000   0.010000   1.040000 (  1.030018)
koraktor      0.390000   0.000000   0.390000 (  0.389431)
koraktor2     0.390000   0.000000   0.390000 (  0.389072)

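Koraktor's original code doesn't return the requested values, but he turned it around nicely with his second code pass, and walks away with the best speed. I added the santhosh_2 method to see what effect removing key? would have. It sped the routine up a little, but not enough to catch up to Koraktor's. One caveat: keep_if modifies its receiver, so once koraktor and koraktor2 have run in the puts line, HASH1 and HASH2 each hold only keys 3 and 4, and all of the timings are measured against those trimmed hashes.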


Just for documentation purposes, I also tweaked Koraktor's second version to drop the key? call, which shaved off more time. Here's the added method and the new output:

def koraktor3
  HASH2.keep_if { |k, v| HASH1[k] }
end

1.9.3
N= 1000000
tinman = {3=>"hello", 4=>"world"}
santhosh = {3=>"hello", 4=>"world"}
santhosh_2 = {3=>"hello", 4=>"world"}
priti = {3=>"hello", 4=>"world"}
koraktor = {3=>"c", 4=>"d"}
koraktor2 = {3=>"hello", 4=>"world"}
koraktor3 = {3=>"hello", 4=>"world"}
                  user     system      total        real
tinman        2.380000   0.000000   2.380000 (  2.382392)
santhosh_2    0.970000   0.020000   0.990000 (  0.976672)
santhosh      1.070000   0.010000   1.080000 (  1.078397)
priti         1.320000   0.000000   1.320000 (  1.318652)
koraktor      0.480000   0.000000   0.480000 (  0.488613)
koraktor2     0.490000   0.000000   0.490000 (  0.490099)
koraktor3     0.390000   0.000000   0.390000 (  0.389386)

2.0.0
N= 1000000
tinman = {3=>"hello", 4=>"world"}
santhosh = {3=>"hello", 4=>"world"}
santhosh_2 = {3=>"hello", 4=>"world"}
priti = {3=>"hello", 4=>"world"}
koraktor = {3=>"c", 4=>"d"}
koraktor2 = {3=>"hello", 4=>"world"}
koraktor3 = {3=>"hello", 4=>"world"}
                  user     system      total        real
tinman        1.840000   0.000000   1.840000 (  1.832491)
santhosh_2    0.720000   0.010000   0.730000 (  0.737737)
santhosh      0.780000   0.020000   0.800000 (  0.801619)
priti         1.040000   0.010000   1.050000 (  1.044588)
koraktor      0.390000   0.000000   0.390000 (  0.387265)
koraktor2     0.390000   0.000000   0.390000 (  0.388648)
koraktor3     0.320000   0.000000   0.320000 (  0.327859)
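
One thing to keep in mind with the HASH1[k] variants (santhosh_2 and koraktor3): the block tests the value stored in HASH1, not whether the key exists, so a shared key whose value is nil or false gets dropped. A small sketch with made-up hashes (settings and overrides are hypothetical names, not part of the benchmark) shows the difference:

```ruby
# Hypothetical data: :debug and :color are shared keys with falsy values in settings
settings  = {verbose: true, debug: false, color: nil}
overrides = {verbose: "yes", debug: "on", color: "red", extra: "x"}

# Lookup by value: false and nil are falsy, so :debug and :color are dropped
by_value = overrides.select { |k, _v| settings[k] }

# Lookup by key: every shared key is kept regardless of its value
by_key = overrides.select { |k, _v| settings.key?(k) }

p by_value.keys   # [:verbose]
p by_key.keys     # [:verbose, :debug, :color]
```

With values that are always truthy, such as the strings in the benchmark, both forms return the same hash.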
the Tin Man answered Nov 14 '22 07:11


hash2.select {|key, value| hash1.has_key? key }
# => {3=>"hello", 4=>"world"}
Santhosh answered Nov 14 '22 05:11


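Ruby 2.5 added Hash#slice, which allows compact code like:

hash3 = hash2.slice(*hash1.keys)
#=> {3=>"hello", 4=>"world"}

The values come from the receiver, so hash1.slice(*hash2.keys) would return {3=>"c", 4=>"d"} instead. In older Rubies this was possible in Rails, or in projects using Active Support's hash extensions.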
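
A quick check of how Hash#slice behaves with the hashes from the question, assuming Ruby 2.5 or newer. The values come from the receiver, and keys the receiver does not have are skipped (from_hash1 and from_hash2 are illustrative names):

```ruby
hash1 = {1 => "a", 2 => "b", 3 => "c", 4 => "d"}
hash2 = {3 => "hello", 4 => "world", 5 => "welcome"}

# slice returns a new hash with only the requested keys;
# hash2 has no keys 1 and 2, so they are simply left out
from_hash2 = hash2.slice(*hash1.keys)

# Same call with the roles swapped: the values now come from hash1
from_hash1 = hash1.slice(*hash2.keys)

p(from_hash2 == {3 => "hello", 4 => "world"})  # true
p(from_hash1 == {3 => "c", 4 => "d"})          # true
p hash2.size                                   # 3
```

Unlike keep_if, slice leaves both original hashes unchanged.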

artm answered Nov 14 '22 05:11