I'm going through Ruby Koans, and I hit #41 which I believe is this:
    def test_default_value_is_the_same_object
      hash = Hash.new([])
      hash[:one] << "uno"
      hash[:two] << "dos"
      assert_equal ["uno","dos"], hash[:one]
      assert_equal ["uno","dos"], hash[:two]
      assert_equal ["uno","dos"], hash[:three]
      assert_equal true, hash[:one].object_id == hash[:two].object_id
    end
I could not understand the behavior, so I Googled it and found "Strange ruby behavior when using Hash default value, e.g. Hash.new([])", which answered the question nicely.
So I understand how that works. My question is: why does a default value such as an integer that gets incremented not get changed during use? For example:
    puts "Text please: "
    text = gets.chomp
    words = text.split(" ")
    frequencies = Hash.new(0)
    words.each { |word| frequencies[word] += 1 }
This takes user input and counts the number of times each word is used. It works because the default value of 0 is always used.
I have a feeling it has to do with the << operator, but I'd love an explanation.
Another initialization method is to pass Hash.new a block, which is invoked each time a value is requested for a key that has no value. This allows you to use a distinct value for each key. The block is passed two arguments: the hash being asked for a value, and the key used.
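As a minimal sketch of the block form described above (the variable name `lists` is only for illustration), the block below stores a fresh array under each missing key, so every key ends up with its own object:

```ruby
# The block receives the hash itself and the key that was missing.
# Assigning inside the block stores a distinct array per key.
lists = Hash.new { |hash, key| hash[key] = [] }

lists[:one] << "uno"
lists[:two] << "dos"

p lists[:one]                      # => ["uno"]
p lists[:two]                      # => ["dos"]
p lists.keys                       # => [:one, :two]
p lists[:one].equal?(lists[:two])  # => false
```

Without the assignment inside the block, the value would be returned but never stored in the hash.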
Convert the key from a string to a symbol, and do a lookup in the hash. Rails uses this class called HashWithIndifferentAccess that proves to be very useful in such cases.
A Hash is a dictionary-like collection of unique keys and their values. Also called associative arrays, they are similar to Arrays, but where an Array uses integers as its index, a Hash allows you to use any object type. Hashes enumerate their values in the order that the corresponding keys were inserted.
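A small illustration of those two properties, using made-up keys: any object can serve as a key, and enumeration follows insertion order.

```ruby
h = {}
h["b"]    = 1   # String key
h[:a]     = 2   # Symbol key
h[[1, 2]] = 3   # Array key

p h.keys    # => ["b", :a, [1, 2]]
p h.values  # => [1, 2, 3]
```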
The other answers seem to indicate that the difference in behavior is due to Integers being immutable and Arrays being mutable. But that is misleading. The difference is not that the creator of Ruby decided to make one immutable and the other mutable. The difference is that you, the programmer, decided to mutate one but not the other.
The question is not whether Arrays are mutable, the question is whether you mutate it.
You can get both the behaviors you see above, just by using Arrays. Observe:
Array with mutation

    hsh = Hash.new([])

    hsh[:one] << 'one'
    hsh[:two] << 'two'

    hsh[:nonexistent]
    # => ['one', 'two']
    # Because we mutated the default value, nonexistent keys return the changed value

    hsh
    # => {}
    # But we never mutated the hash itself, therefore it is still empty!
Array without mutation

    hsh = Hash.new([])

    hsh[:one] += ['one']
    hsh[:two] += ['two']
    # This is syntactic sugar for hsh[:two] = hsh[:two] + ['two']

    hsh[:nonexistent]
    # => []
    # We didn't mutate the default value, it is still an empty array

    hsh
    # => { :one => ['one'], :two => ['two'] }
    # This time, we *did* mutate the hash.
Array every time with mutation

    hsh = Hash.new { [] }
    # This time, instead of a default *value*, we use a default *block*

    hsh[:one] << 'one'
    hsh[:two] << 'two'

    hsh[:nonexistent]
    # => []
    # We *did* mutate the default value, but it was a fresh one every time.

    hsh
    # => {}
    # But we never mutated the hash itself, therefore it is still empty!

    hsh = Hash.new {|hsh, key| hsh[key] = [] }
    # This time, instead of a default *value*, we use a default *block*
    # And the block not only *returns* the default value, it also *assigns* it

    hsh[:one] << 'one'
    hsh[:two] << 'two'

    hsh[:nonexistent]
    # => []
    # We *did* mutate the default value, but it was a fresh one every time.

    hsh
    # => { :one => ['one'], :two => ['two'], :nonexistent => [] }
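One way to check that the difference lies in what you do to the default rather than in its type is to freeze the default array. This is only a sketch: it assumes Ruby 2.5 or later for the `FrozenError` class name (older versions raise a plain `RuntimeError`), and the `error` variable is introduced here just to capture the exception.

```ruby
hsh = Hash.new([].freeze)

# += builds a new array and assigns it to the key,
# so the frozen default is never touched.
hsh[:one] += ["uno"]

# << tries to mutate the shared default in place, which a frozen array refuses.
error = begin
  hsh[:two] << "dos"
  nil
rescue RuntimeError => e  # FrozenError is a subclass of RuntimeError
  e
end

p hsh[:one]      # => ["uno"]
p hsh[:missing]  # => []
p error.class    # FrozenError on Ruby 2.5 and later
```

The same Array default works fine with `+=` and fails with `<<`, which matches the point that the behavior depends on whether you mutate.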
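It is because Array in Ruby is a mutable object, so you can change its internal state, but Fixnum (merged into Integer since Ruby 2.4) isn't mutable. So when you increment a value using +=, internally it does the following (assume that i is our reference to a Fixnum object):

1. Get the object referenced by i.
2. Get its raw value (let's call it raw_tmp).
3. Create a new object whose internal value is raw_tmp + 1.
4. Assign a reference to the newly created object to i.

So as you can see, we created a new object, and i now refers to something different than at the beginning.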
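The rebinding can be observed roughly like this. It is a sketch, not a description of Ruby's internals: it only compares `object_id` values before and after `+=` and then checks that a hash's default of 0 is left untouched (the names `before` and `after` are illustrative).

```ruby
i = 1
before = i.object_id
i += 1                 # shorthand for i = i + 1
after = i.object_id

p i                # => 2
p before == after  # => false, i now refers to a different object

frequencies = Hash.new(0)
frequencies["a"] += 1  # frequencies["a"] = frequencies["a"] + 1
frequencies["a"] += 1

p frequencies["a"]      # => 2
p frequencies["b"]      # => 0
p frequencies.default   # => 0, the default itself never changed
```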
On the other hand, when we use Array#<<, it works this way:

1. Get the object referenced by arr.
2. Append the given element to its internal state.

So as you can see, it is much simpler, but it can cause some bugs. One of them is the one in your question; another is a race between threads when both try to append elements at the same time. Sometimes you can end up with only some of the elements, or with garbage in memory. When you use += on arrays instead, you get rid of both of these problems (or at least minimise their impact).
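The aliasing side of this (not the threading claim, which this sketch does not attempt to demonstrate) can be seen with two references to one array; the variable names are made up for the example.

```ruby
shared = []

alias_ref = shared
alias_ref << "uno"      # mutates the one array both names refer to
p shared                # => ["uno"]

copy = shared
copy += ["dos"]         # builds a new array and rebinds copy only
p shared                # => ["uno"]
p copy                  # => ["uno", "dos"]
p copy.equal?(shared)   # => false
```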