Ruby Hash Interaction With Pushing Onto Array

Tags: ruby, hash

So let's say I do the following:

lph = Hash.new([])       #=> {}
lph["passed"] << "LCEOT" #=> ["LCEOT"]
lph                      #=> {} <-- Expected that to have been {"passed" => ["LCEOT"]}
lph["passed"]            #=> ["LCEOT"]
lph["passed"] = lph["passed"] << "HJKL"
lph #=> {"passed"=>["LCEOT", "HJKL"]}

I'm surprised by this. A couple of questions:

  1. Why does it not get set until I push the second string onto the array? What is happening in the background?
  2. What is the more idiomatic Ruby way to say: I have a hash, a key, and a value that I want to end up in the array associated with that key? How do I push the value onto that array the first time the key is used, and in all later uses of the key just add to the array?
Noah Clark asked Jan 10 '23 21:01


2 Answers

Read the Ruby Hash.new documentation carefully - "if this hash is subsequently accessed by a key that doesn’t correspond to a hash entry, the value returned depends on the style of new used to create the hash".

new(obj) → new_hash

...If obj is specified, this single object will be used for all default values.

In your example you attempt to push something onto the value associated with a key which does not exist, so you end up mutating the same anonymous array you used to construct the hash initially.

the_array = []
h = Hash.new(the_array)
h['foo'] << 1 # => [1]
# Since the key 'foo' was not found
# ... the default value (the_array) is returned
# ... and 1 is pushed onto it (hence [1]).
the_array # => [1]
h # => {} since the key 'foo' still has no value.
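
A minimal sketch of the same effect across two different keys, assuming plain Ruby with no extra libraries (the variable names are illustrative only). It suggests that every missing key hands back the one default array, and that nothing is ever stored in the hash:

```ruby
shared = Hash.new([])

shared[:a] << 1   # mutates the default array; nothing is stored under :a
shared[:b] << 2   # a different key, but the same default array again

default_array = shared.default                  # the single shared default
same_object = shared[:a].equal?(shared[:b])     # identity check, not equality
keys = shared.keys

p default_array   # => [1, 2]
p same_object     # => true
p keys            # => []
```

Hash#default returns the object that was passed to Hash.new, which makes it a convenient way to inspect what the pushes actually changed.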

You probably want to use the block form:

new { |hash, key| block } → new_hash

...If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.

For example:

h = Hash.new { |hash, key| hash[key] = [] } # Assign a new array as default for missing keys.
h['foo'] << 1 # => [1]
h['foo'] << 2 # => [1, 2]
h['bar'] << 3 # => [3]
h # => { 'foo' => [1, 2], 'bar' => [3] }
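
As a rough sketch of how this might look for the use case in the question, here is a grouping loop. The input pairs and variable names are hypothetical, made up for illustration. It also shows the `||=` form, a common alternative that needs no default at all:

```ruby
# Hypothetical input: [status, id] pairs.
results = [["passed", "LCEOT"], ["failed", "QWER"], ["passed", "HJKL"]]

# Block form: the block stores a fresh array under each missing key.
by_status = Hash.new { |hash, key| hash[key] = [] }
results.each { |status, id| by_status[status] << id }

# Alternative with an ordinary hash: create the array on first use.
plain = {}
results.each { |status, id| (plain[status] ||= []) << id }

# Both hashes now map "passed" to ["LCEOT", "HJKL"] and "failed" to ["QWER"].
p by_status == plain
```

One trade-off to keep in mind: with the storing block, merely reading a missing key (for example by_status["skipped"]) adds that key with an empty array. Hash#key? or Hash#fetch can be used for lookups that should not create entries.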
maerics answered Jan 13 '23 10:01


Why does it not get set until I push the second string on to the array?

In short: because nothing is stored in the hash until the line where you also add the second string to the array.

What is happening in the background?

To see what's happening in the background, let's take this one line at a time:

lph = Hash.new([])       #=> {}

This creates an empty hash, configured to return the [] object whenever a non-existing key is accessed.

lph["passed"] << "LCEOT" #=> ["LCEOT"]

This can be written as

value = lph["passed"] #=> []
value << "LCEOT"      #=> ["LCEOT"]

We see that lph["passed"] returns [] as expected, and we then proceed to append "LCEOT" to [].

lph                  #=> {}

lph is still an empty Hash. At no point have we added anything to the Hash. We have added something to its default value, but that doesn't change lph itself.

lph["passed"]        #=> ["LCEOT"]

This is where it gets interesting. Remember above when we did value << "LCEOT". That mutated the very object that lph returns when a key isn't found. The default value is no longer [], but has become ["LCEOT"]. That modified default value is returned here.

lph["passed"] = lph["passed"] << "HJKL"

This is our first change to lph. What we actually assign to lph["passed"] is the default value (because "passed" is still a non-existing key in lph) with "HJKL" appended. Before this line the default value was ["LCEOT"]; after it, the default is ["LCEOT", "HJKL"], and lph["passed"] refers to that same array.

In other words lph["passed"] << "HJKL" returns ["LCEOT", "HJKL"] which is then assigned to lph["passed"].
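
The walkthrough can be checked with a short sketch that inspects Hash#default after each step. The helper variable names are illustrative, not part of the original example:

```ruby
lph = Hash.new([])
default_before = lph.default.dup        # snapshot of the default: []

lph["passed"] << "LCEOT"
default_after_push = lph.default.dup    # snapshot: ["LCEOT"]
size_after_push = lph.size              # still 0, nothing was stored

lph["passed"] = lph["passed"] << "HJKL"
stored_is_default = lph["passed"].equal?(lph.default)  # same object?
other_key = lph["failed"]               # a missing key returns the default

p default_before      # => []
p default_after_push  # => ["LCEOT"]
p size_after_push     # => 0
p stored_is_default   # => true
p other_key           # => ["LCEOT", "HJKL"]
```

The last two lines point at a lingering problem: the array stored under "passed" is still the default object, so any other missing key sees the same contents.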

What is the more idiomatic Ruby way

Using <<=:

>> lph = Hash.new { [] }
=> {}
>> lph["passed"] <<= "LCEOT"
=> ["LCEOT"]
>> lph
=> {"passed"=>["LCEOT"]}

Also note the change in how the Hash is initialized: it uses a block instead of a literal array. The block runs each time a missing key is accessed, so every such access gets a new, empty array rather than the same shared one. The block does not store that array in the hash, which is why the assignment form <<= is used above.
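
A small sketch contrasting << with <<= on a hash built with Hash.new { [] }, again with illustrative names. It suggests that a bare << is silently lost with this kind of block, while <<= stores the result:

```ruby
fresh = Hash.new { [] }

fresh["a"] << 1                        # appends to a throwaway array
stored_after_push = fresh.key?("a")    # nothing was stored

fresh["a"] <<= 1                       # same as fresh["a"] = fresh["a"] << 1
fresh["b"] <<= 2
distinct = !fresh["a"].equal?(fresh["b"])

p stored_after_push   # => false
p fresh == { "a" => [1], "b" => [2] }  # => true
p distinct            # => true
```

Because each key gets its own array here, later pushes with a plain << on an already stored key are safe; only the first write for a key needs the assignment.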

Jakob S answered Jan 13 '23 11:01