Question

From what I understand, when you assign an object to another variable with =, the new variable always holds a reference to the same object, which is why we have methods like .dup and .clone to create an actual copy of an object rather than a reference.

However, I am duplicating or cloning a hash that contains an array of hashes, and when I delete a key from the nested hashes in the original, it is deleted from the copy as well! This is not supposed to happen; I wonder what I'm doing wrong.

Code:

or_data = {title: 'some title', tracks: [ { name: 'track one', position: 0, 
  artist: 'orignal artist', composer: 'original composer', duration: '1:30' }, 
  { name: 'track two', position: 1, artist: 'some other guy', 
  composer: 'beethoven', duration: '2:10' } ]  }

new_hash = or_data.dup
# or new_hash = or_data.clone, either way produces the same result

or_data[:tracks].each { |e| e.delete(:position) }

The :position key will also be deleted from new_hash!

This happens regardless of whether I use .dup or .clone.

I just read a post that says one should use:

new_hash = Marshal.load( Marshal.dump(or_data) )

This does work. But why? Is it because .dup and .clone make "shallow copies", meaning they copy the reference to :tracks (in this example) rather than the array itself, since it is an array of hashes contained within a hash?

Solution

Have a look at the code below:

or_data = {title: 'some title', tracks: [ { name: 'track one', position: 0,
  artist: 'orignal artist', composer: 'original composer', duration: '1:30' },
  { name: 'track two', position: 1, artist: 'some other guy',
  composer: 'beethoven', duration: '2:10' } ]  }
new_hash = or_data.dup

p "Using .dup"
p "-----------"
p "or_data : #{or_data.object_id}"
p "new_hash : #{new_hash.object_id}"

p "or_data[:tracks] :#{or_data[:tracks].object_id}"
p "new_hash[:tracks] : #{new_hash[:tracks].object_id}"


or_data[:tracks].each { |e| e.delete(:position) }


new_hash = Marshal.load( Marshal.dump(or_data) )

p "Marshalling"
p "-----------"
p "or_data : #{or_data.object_id}"
p "new_hash : #{new_hash.object_id}"

p "or_data[:tracks] :#{or_data[:tracks].object_id}"
p "new_hash[:tracks] : #{new_hash[:tracks].object_id}"

Output:

"Using .dup"
"-----------"
"or_data : 5282580"
"new_hash : 5282568"
"or_data[:tracks] :5282592"
"new_hash[:tracks] : 5282592"

"Marshalling"
"-----------"
"or_data : 5282580"
"new_hash : 5282172"
"or_data[:tracks] :5282592"
"new_hash[:tracks] : 5282112"

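The :position key gets deleted when using .dup or .clone because the :tracks key in both hashes still refers to the same array object (note the identical object_id in the output above). After marshalling, by contrast, the :tracks key refers to an entirely new array object, and the hashes inside it are new objects as well.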
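
To make the difference concrete without reading object ids, here is a minimal sketch that checks object identity with equal? and then repeats the deletion. The variable names (original, shallow, deep) are made up for this illustration, and the data is a trimmed-down version of the hash from the question.

```ruby
original = { title: 'some title', tracks: [{ name: 'track one', position: 0 }] }

shallow = original.dup                        # copies only the outer hash
deep    = Marshal.load(Marshal.dump(original)) # rebuilds the whole structure

# equal? compares object identity, not contents.
shallow_shares_tracks = shallow[:tracks].equal?(original[:tracks])
deep_shares_tracks    = deep[:tracks].equal?(original[:tracks])

# Mutate the nested hashes through the original.
original[:tracks].each { |track| track.delete(:position) }

puts shallow_shares_tracks                   # true
puts deep_shares_tracks                      # false
puts shallow[:tracks].first.key?(:position)  # false: the shared hash was changed
puts deep[:tracks].first.key?(:position)     # true: the deep copy is unaffected
```

The same holds if .clone is used instead of .dup, since both copy only the top-level object.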

OTHER TIPS

You are cloning the Hash object itself, not the values it holds. Neither .dup nor .clone performs nested cloning.

In your case, the array ([ { name: 'track one', position: 0, artist: 'orignal artist', composer: 'original composer', duration: '1:30' }, { name: 'track two', position: 1, artist: 'some other guy', composer: 'beethoven', duration: '2:10' } ]) is the same object in both the original and the copy after clone or dup.
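
If you would rather not go through Marshal (which cannot serialize objects such as procs or IO handles), nested cloning can be written by hand. The following is only a sketch: deep_copy is a hypothetical helper name, not a built-in method, and it assumes the data consists solely of hashes, arrays, strings and immutable values such as symbols and integers.

```ruby
# Hypothetical helper: recursively copies nested hashes and arrays.
def deep_copy(obj)
  case obj
  when Hash
    # Keys are reused as-is; only the values are copied recursively.
    obj.each_with_object({}) { |(key, value), copy| copy[key] = deep_copy(value) }
  when Array
    obj.map { |element| deep_copy(element) }
  when String
    obj.dup
  else
    # Assumed immutable (symbols, integers, nil, ...), so returned unchanged.
    obj
  end
end

or_data = { title: 'some title',
            tracks: [{ name: 'track one', position: 0 },
                     { name: 'track two', position: 1 }] }

new_hash = deep_copy(or_data)
or_data[:tracks].each { |e| e.delete(:position) }

p new_hash[:tracks].map { |e| e[:position] }  # [0, 1]
p or_data[:tracks].map { |e| e[:position] }   # [nil, nil]
```

Any other mutable object stored in the structure would still be shared under this sketch, so extend the case statement if your data holds more than these types.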

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow