Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to flatten a hash, making each key a unique value?

Tags:

flatten

ruby

hash

I want to take a hash with nested hashes and arrays and flatten it out into a single hash with unique values. I keep trying to approach this from different angles, but then I make it way more complex than it needs to be and get myself lost in what's happening.

Example Source Hash:

{
  "Name" => "Kim Kones",
  "License Number" => "54321",
  "Details" => {
    "Name" => "Kones, Kim",
    "Licenses" => [
      {
        "License Type" => "PT",
        "License Number" => "54321"
      },
      {
        "License Type" => "Temp",
        "License Number" => "T123"
      },
      {
        "License Type" => "AP",
        "License Number" => "A666",
        "Expiration Date" => "12/31/2020"
      }
    ]
  }
}

Example Desired Hash:

{
  "Name" => "Kim Kones",
  "License Number" => "54321",
  "Details_Name" => "Kones, Kim",
  "Details_Licenses_1_License Type" => "PT",
  "Details_Licenses_1_License Number" => "54321",
  "Details_Licenses_2_License Type" => "Temp",
  "Details_Licenses_2_License Number" => "T123",
  "Details_Licenses_3_License Type" => "AP",
  "Details_Licenses_3_License Number" => "A666",
  "Details_Licenses_3_Expiration Date" => "12/31/2020"
}

For what it's worth, here's my most recent attempt before giving up.

def flattify(hashy)
    temp = {}
    hashy.each do |key, val|
        if val.is_a? String
            temp["#{key}"] = val
        elsif val.is_a? Hash
            temp.merge(rename val, key, "")
        elsif val.is_a? Array
            temp["#{key}"] = enumerate val, key
        else
        end
        print "=> #{temp}\n"
    end
    return temp
end

def rename (hashy, str, n)
    temp = {}
    hashy.each do |key, val|
        if val.is_a? String
            temp["#{key}#{n}"] = val
        elsif val.is_a? Hash
            val.each do |k, v|
                temp["#{key}_#{k}#{n}"] = v
            end
        elsif val.is_a? Array
            temp["#{key}"] = enumerate val, key
        else

        end
    end
    return flattify temp
end

def enumerate (ary, str)
    temp = {}
    i = 1
    ary.each do |x|
        temp["#{str}#{i}"] = x
        i += 1
    end
    return flattify temp
end
like image 434
wilcro Avatar asked Feb 16 '18 23:02

wilcro


People also ask

What is a key-value hash?

A hash table is a type of data structure that stores key-value pairs. The key is sent to a hash function that performs arithmetic operations on it. The result (commonly called the hash value or hash) is the index of the key-value pair in the hash table.

How can you tell if a key is present in a hash?

We can check if a particular hash contains a particular key by using the method has_key?(key) . It returns true or false depending on whether the key exists in the hash or not.

What is the use of flatten() method in hash?

Hash#flatten () is a Hash class method which returns the one-dimensional flattening hash array. Hash a flatten form : [:a, 100, :b, 200] Hash b flatten form : [:a, 100, :c, [300, 45], :b, 200] Hash c flatten form : [:a, 100]

How to use the reverse function on a hash?

Using reverse If in our hash the values are unique, that is each value is different, then we can just pass the hash to he reverse function and assign the result to another hash. We will get a new hash in which the former values will be the key, and the former key will be values:

How to exchange keys with values of hash without losing data?

How to exchange keys with values of a hash without losing any data even if there a are duplicate values in the original hash? If in our hash the values are unique, that is each value is different, then we can just pass the hash to he reverse function and assign the result to another hash.

Why can't we reverse a hash in Perl?

The reason is that the keys of a hash must be unique. As we reverse the hash we have two keys (former values) that are identical. As Perl works through all the key-value pairs the last value of each key wins. Which one is the last can change from execution-to-execution.


Video Answer


2 Answers

Interesting question!

Theory

Here's a recursive method to parse your data.

  • It keeps track of which keys and indices it has found.
  • It appends them in a tmp array.
  • Once a leaf object has been found, it gets written in a hash as value, with a joined tmp as key.
  • This small hash then gets recursively merged back to the main hash.

Code

def recursive_parsing(object, tmp = [])
  case object
  when Array
    object.each.with_index(1).with_object({}) do |(element, i), result|
      result.merge! recursive_parsing(element, tmp + [i])
    end
  when Hash
    object.each_with_object({}) do |(key, value), result|
      result.merge! recursive_parsing(value, tmp + [key])
    end
  else
    { tmp.join('_') => object }
  end
end

As an example:

require 'pp'
pp recursive_parsing(data)
# {"Name"=>"Kim Kones",
#  "License Number"=>"54321",
#  "Details_Name"=>"Kones, Kim",
#  "Details_Licenses_1_License Type"=>"PT",
#  "Details_Licenses_1_License Number"=>"54321",
#  "Details_Licenses_2_License Type"=>"Temp",
#  "Details_Licenses_2_License Number"=>"T123",
#  "Details_Licenses_3_License Type"=>"AP",
#  "Details_Licenses_3_License Number"=>"A666",
#  "Details_Licenses_3_Expiration Date"=>"12/31/2020"}

Debugging

Here's a modified version with old-school debugging. It might help you understand what's going on:

def recursive_parsing(object, tmp = [], indent="")
  puts "#{indent}Parsing #{object.inspect}, with tmp=#{tmp.inspect}"
  result = case object
  when Array
    puts "#{indent} It's an array! Let's parse every element:"
    object.each_with_object({}).with_index(1) do |(element, result), i|
      result.merge! recursive_parsing(element, tmp + [i], indent + "  ")
    end
  when Hash
    puts "#{indent} It's a hash! Let's parse every key,value pair:"
    object.each_with_object({}) do |(key, value), result|
      result.merge! recursive_parsing(value, tmp + [key], indent + "  ")
    end
  else
    puts "#{indent} It's a leaf! Let's return a hash"
    { tmp.join('_') => object }
  end
  puts "#{indent} Returning #{result.inspect}\n"
  result
end

When called with recursive_parsing([{a: 'foo', b: 'bar'}, {c: 'baz'}]), it displays:

Parsing [{:a=>"foo", :b=>"bar"}, {:c=>"baz"}], with tmp=[]
 It's an array! Let's parse every element:
  Parsing {:a=>"foo", :b=>"bar"}, with tmp=[1]
   It's a hash! Let's parse every key,value pair:
    Parsing "foo", with tmp=[1, :a]
     It's a leaf! Let's return a hash
     Returning {"1_a"=>"foo"}
    Parsing "bar", with tmp=[1, :b]
     It's a leaf! Let's return a hash
     Returning {"1_b"=>"bar"}
   Returning {"1_a"=>"foo", "1_b"=>"bar"}
  Parsing {:c=>"baz"}, with tmp=[2]
   It's a hash! Let's parse every key,value pair:
    Parsing "baz", with tmp=[2, :c]
     It's a leaf! Let's return a hash
     Returning {"2_c"=>"baz"}
   Returning {"2_c"=>"baz"}
 Returning {"1_a"=>"foo", "1_b"=>"bar", "2_c"=>"baz"}
like image 70
Eric Duminil Avatar answered Oct 20 '22 01:10

Eric Duminil


Unlike the others, I have no love for each_with_object :-). But I do like passing a single result hash around so I don't have to merge and remerge hashes over and over again.

def flattify(value, result = {}, path = [])
  case value
  when Array
    value.each.with_index(1) do |v, i|
      flattify(v, result, path + [i])
    end
  when Hash
    value.each do |k, v|
      flattify(v, result, path + [k])
    end
  else
    result[path.join("_")] = value
  end
  result
end

(Some details adopted from Eric, see comments)

like image 22
Stefan Pochmann Avatar answered Oct 19 '22 23:10

Stefan Pochmann