Question

While experimenting with strings in irb, I noticed that when a variable referencing a String is used as a key in a Hash, the Hash stores a new copy of the String rather than a reference to the original object. This isn't the case with an Array:

1.9.3-p448 :051 > a = 'str1' 
 => "str1"
1.9.3-p448 :052 > b = 'str2' 
 => "str2" 
1.9.3-p448 :053 > arr = [a,b] 
 => ["str1", "str2"] 
1.9.3-p448 :054 > arr[0].object_id == a.object_id 
 => true 
1.9.3-p448 :055 > hash = { a => b } 
 => {"str1"=>"str2"} 
1.9.3-p448 :056 > hash.keys[0].object_id == a.object_id 
 => false

I understand if I just stuck to symbols I wouldn't be asking this question.

What is the purpose of making a copy of the String? I understand that a string comparison would still work, but surely an object_id comparison would be quicker?


Solution

From the Hash#[]= documentation:

key should not have its value changed while it is in use as a key (an unfrozen String passed as a key will be duplicated and frozen).

By default, strings in Ruby are mutable, so you could change one after using it as a key in your hash. If the hash held a reference to the original object, such a change would corrupt it: the entry would still be filed under the hash value of the old contents, so lookups would no longer find the key reliably.

Since strings are ubiquitous and are often passed around by reference, Ruby protects its hashes this way from bugs that would be very hard to detect.
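A minimal sketch of the behaviour described in the documentation quote above. The variable names are only illustrative, and `.dup` is used so the strings are unfrozen regardless of any frozen-string-literal setting:

```ruby
# An unfrozen String used as a key is duplicated and frozen by Hash#[]=.
a = 'str1'.dup                 # .dup guarantees an unfrozen String
hash = {}
hash[a] = 'value'

key = hash.keys.first
same_object = key.equal?(a)    # false: the hash stored its own copy
key_frozen  = key.frozen?      # true: the copy cannot be mutated
orig_frozen = a.frozen?        # false: the original is left untouched

# Mutating the original does not disturb the hash.
a << '!'
still_found = hash['str1']     # "value"
new_lookup  = hash.key?(a)     # false: a is now "str1!"

# A String that is already frozen is stored as-is, with no copy.
frozen = 'str3'.dup.freeze
h2 = {}
h2[frozen] = 1
kept_as_is = h2.keys.first.equal?(frozen)  # true
```

If the copy matters to you (for example, for memory reasons), freezing the string before using it as a key avoids the duplication.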

OTHER TIPS

Most of the usual kinds of keys are immutable: numbers, symbols, dates. Strings, however, are mutable, and as Uri Agassi writes, Ruby protects the hash from bugs by copying them. It does not do so for arrays used as keys, perhaps for performance reasons (arrays can be large) or perhaps because arrays are not commonly used as keys. Hashes normally compare keys using the result of the hash method, which every object has, together with eql?. If you want a hash to compare keys by object identity instead, you can switch that on with hash.compare_by_identity.
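A rough illustration of both points, assuming a current MRI Ruby (the variable names are made up for the example). An Array key is stored by reference, so mutating it can leave the entry unreachable until the hash is rebuilt with rehash. A hash switched to compare_by_identity keeps the original String object as its key:

```ruby
# Arrays used as keys are not copied, so the hash is unprotected.
key = [1, 2]
h = {}
h[key] = 'found'
stored_by_reference = h.keys.first.equal?(key)  # true

key << 3            # changes key.hash while the array is in use as a key
lost = h[key]       # usually nil: the entry is still filed under the old hash value

h.rehash            # re-index all entries by their current hash values
recovered = h[key]  # "found"

# compare_by_identity: keys match only if they are the very same object.
ident = {}.compare_by_identity
s1 = 'same'.dup
s2 = 'same'.dup     # equal contents, different object
ident[s1] = 1

by_same_object  = ident[s1]                    # 1
by_equal_string = ident[s2]                    # nil
key_not_copied  = ident.keys.first.equal?(s1)  # true: no dup-and-freeze here
```

Note that an identity hash gives up the protection described above: s1 stays mutable, which is harmless only because its contents no longer take part in the lookup.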

Licensed under: CC-BY-SA with attribution