Ruby Hash

One of the popular features of Ruby’s Hash class is that you can specify a default value for keys that are not in the hash. For example, a common use case is to count the frequency of objects in an array.

hash = Hash.new
%w( orange red blue orange ).each do |colour|
  hash[colour] ||= 0
  hash[colour] += 1
end

# hash => {"orange"=>2, "red"=>1, "blue"=>1} 

Using the Hash initializer we can specify the default value for the hash as 0, which simplifies our code:

hash = Hash.new(0)
%w( orange red blue orange ).each { |colour| hash[colour] += 1 }

# hash => {"orange"=>2, "red"=>1, "blue"=>1} 
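It may help to see what the default does and does not do. The sketch below (the variable names are only illustrative) suggests that reading a missing key returns the default without adding the key, and that it is the assignment hidden inside += that actually creates the entry.

```ruby
counts = Hash.new(0)

# Reading a missing key returns the default but stores nothing.
value_for_missing = counts["green"]        # 0
present_after_read = counts.key?("green")  # false

# counts["green"] += 1 expands to counts["green"] = counts["green"] + 1,
# so the assignment is what adds the key to the hash.
counts["green"] += 1
present_after_increment = counts.key?("green")  # true
```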

Another common use case is to have every entry in the hash be an array. As an example, here we group colours by their first letter.

hash = Hash.new
%w( brown orange black red blue ).each do |colour|
  key = colour[0]
  hash[key] ||= []
  hash[key] << colour
end

# hash => {"b"=>["brown", "black", "blue"], "o"=>["orange"], "r"=>["red"]}

We might try the same trick as above and specify an empty array as the default value for the hash, with dire consequences:

hash = Hash.new([])
%w( brown orange black red blue ).each do |colour|
  key = colour[0]
  hash[key] << colour
end

# hash => {} 

# hash["b"] => ["brown", "orange", "black", "red", "blue"] 

# hash["x"] => ["brown", "orange", "black", "red", "blue"] 

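This is obviously not what we wanted. Ruby is indeed setting the default value for the hash to an array, but it returns the same array object for every missing key. Because << modifies that array in place and never assigns anything to a key, the hash itself stays empty while every colour ends up in the one shared default array.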
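The sharing can be checked directly. This is a minimal sketch with illustrative variable names: it compares object identity for two missing keys and then inspects the hash's default after a single append.

```ruby
shared = Hash.new([])

# Two different missing keys hand back the very same default object.
same_object = shared["b"].equal?(shared["x"])  # true

# << mutates the default array in place; no key is ever assigned.
shared["b"] << "brown"

keys_after_append = shared.keys        # []
default_after_append = shared.default  # ["brown"]
```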

Ruby also allows us to pass a block to the initializer, which will be called whenever we access a key that does not already exist in the hash.

hash = Hash.new { |hsh, key| hsh[key] = [] }
%w( brown orange black red blue ).each do |colour|
  key = colour[0]
  hash[key] << colour
end

# hash => {"b"=>["brown", "black", "blue"], "o"=>["orange"], "r"=>["red"]} 

Here is the relevant documentation for Hash.new:

Returns a new, empty hash. If this hash is subsequently accessed by a key that doesn’t correspond to a hash entry, the value returned depends on the style of new used to create the hash. In the first form, the access returns nil. If obj is specified, this single object will be used for all default values. If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.
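The last sentence of that documentation is easy to overlook. As a rough sketch (variable names are illustrative), compare a block that only returns a fresh array with one that also stores it in the hash:

```ruby
# The block returns a new array each time but never stores it.
not_stored = Hash.new { |_hsh, _key| [] }
not_stored["b"] << "brown"              # appended to a throwaway array
not_stored_keys = not_stored.keys       # []

# The block stores the new array under the key before returning it.
stored = Hash.new { |hsh, key| hsh[key] = [] }
stored["b"] << "brown"
stored_keys = stored.keys               # ["b"]
```

In the first hash each lookup of a missing key builds a fresh array that is then discarded, so nothing accumulates; in the second, the assignment inside the block is what makes the appended value stick.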

Happy coding.