Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Check if two hashes have the same set of keys

Tags:

ruby

hash

What is the most efficient way to check if two hashes h1 and h2 have the same set of keys disregarding the order? Could it be made faster or more concise with close efficiency than the answer that I post?

like image 881
sawa Avatar asked Dec 09 '12 10:12

sawa


People also ask

How to check if two hashes are equal?

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#== ) the corresponding elements in the other hash. The orders of each hashes are not compared.

What is a hash in Ruby?

In Ruby, Hash is a collection of unique keys and their values. Hash is like an Array, except the indexing is done with the help of arbitrary keys of any object type. In Hash, the order of returning keys and their value by various iterators is arbitrary and will generally not be in the insertion order.


3 Answers

Combining freemasonjson's and sawa's ideas:

h1.size == h2.size and (h1.keys - h2.keys).empty?
like image 50
Jan Avatar answered Oct 14 '22 09:10

Jan


Try:

# Check that both hash have the same number of entries first before anything
if h1.size == h2.size
    # breaks from iteration and returns 'false' as soon as there is a mismatched key
    # otherwise returns true
    h1.keys.all?{ |key| !!h2[key] }
end

Enumerable#all?

worse case scenario, you'd only be iterating through the keys once.

like image 5
strider Avatar answered Oct 14 '22 09:10

strider


Just for the sake of having at least a benchmark on this question...

require 'securerandom'
require 'benchmark'

a = {}
b = {}

# Use uuid to get a unique random key
(0..1_000).each do |i|
  key = SecureRandom.uuid
  a[key] = i
  b[key] = i
end

Benchmark.bmbm do |x|
  x.report("#-") do
    1_000.times do
      (a.keys - b.keys).empty? and (a.keys - b.keys).empty?
    end
  end

  x.report("#&") do
    1_000.times do
      computed = a.keys & b.keys
      computed.size == a.size
    end
  end

  x.report("#all?") do
    1_000.times do
      a.keys.all?{ |key| !!b[key] }
    end
  end

  x.report("#sort") do
    1_000.times do
      a_sorted = a.keys.sort
      b_sorted = b.keys.sort
      a == b
    end
  end
end

Results are:

Rehearsal -----------------------------------------
#-      1.000000   0.000000   1.000000 (  1.001348)
#&      0.560000   0.000000   0.560000 (  0.563523)
#all?   0.240000   0.000000   0.240000 (  0.239058)
#sort   0.850000   0.010000   0.860000 (  0.854839)
-------------------------------- total: 2.660000sec

            user     system      total        real
#-      0.980000   0.000000   0.980000 (  0.976698)
#&      0.560000   0.000000   0.560000 (  0.559592)
#all?   0.250000   0.000000   0.250000 (  0.251128)
#sort   0.860000   0.000000   0.860000 (  0.862857)

I have to agree with @akuhn that this would be a better benchmark if we had more information on the dataset you are using. But that being said, I believe this question really needed some hard fact.

like image 4
Vincent B. Avatar answered Oct 14 '22 09:10

Vincent B.