You gave two options for what you want to do:
- Get a list of the values which were dropped in the conversion
- Make the keys unique by adding a special character to the key
I think the second approach is a bad idea, for a couple of reasons: a) you would need a way of modifying the key that allows for the possibility of there being multiple duplicates; and b) making connections between the original and the duplicates would be awkward. Also, it would be just plain ugly.
I see others have suggested a third possibility: changing the form of the resulting hash so that its values are arrays of strings. That might serve you well, but it is not what you asked for, so I chose to build a list of the values that are dropped; i.e., all but the first for each key.
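For comparison, the third possibility (values as arrays of strings) might look roughly like the sketch below. The method name `group_values` and the shortened sample data are hypothetical, made up only for illustration; this is not code from the other answers.

```ruby
# Hypothetical helper: collects every value seen for a key into an array.
def group_values(arr)
  # The default block creates an empty array the first time a key is seen.
  arr.each_slice(2).with_object(Hash.new { |h, k| h[k] = [] }) do |(k, v), h|
    h[k] << v
  end
end

# Shortened, made-up data in the same flat key/value layout as arr.
sample = ["aaa", "refs/heads/auto", "bbb", "refs/heads/elab", "aaa", "refs/heads/master"]
grouped = group_values(sample)
# grouped => {"aaa"=>["refs/heads/auto", "refs/heads/master"], "bbb"=>["refs/heads/elab"]}
```

Nothing is dropped with this form, but every value becomes an array, even for keys that occur only once.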
Code
def create_hash_and_save_extras(arr)
  arr.each_slice(2).with_object([{},[]]) { |(k,v),(h,ex)|
    h.update({k=>v}) { |k, ov, nv| ex << {k=>nv}; ov } }
end
Example
create_hash_and_save_extras(arr)
#=> [{"19d97e408ee3f993745b053e281ac9dc69519e06"=>"refs/heads/auto",
# "8f6f47c6e8023540b022586e368c68e1e814ce6d"=>"refs/heads/callout_hooks",
# "3cbdb4b2fcb85bc7f0ed08b62e2bf2445a7659e8"=>"refs/heads/elab",
# "d38a9a26ef887c08b306bdab210b39882f58e587"=>"refs/heads/elab_6.1",
# "906dfe6eebff832baf0f92683d751432fcc98ab7"=>"refs/heads/regression"},
# [{"19d97e408ee3f993745b053e281ac9dc69519e06"=>"refs/heads/master"}]]
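If the one-liner is hard to read, the same behaviour can be sketched with an explicit loop. The method name `create_hash_and_save_extras_verbose` and the short sample data are hypothetical; the logic is intended to match the method above (keep the first value for each key, record later ones).

```ruby
# Hypothetical, more explicit variant of the same idea.
def create_hash_and_save_extras_verbose(arr)
  h = {}
  ex = []
  arr.each_slice(2) do |k, v|
    if h.key?(k)
      ex << { k => v }   # duplicate key: record the pair that is dropped
    else
      h[k] = v           # first occurrence: keep it
    end
  end
  [h, ex]
end

# Shortened, made-up data in the same flat key/value layout as arr.
sample = ["aaa", "x", "bbb", "y", "aaa", "z"]
h, ex = create_hash_and_save_extras_verbose(sample)
# h  => {"aaa"=>"x", "bbb"=>"y"}
# ex => [{"aaa"=>"z"}]
```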
Explanation
Enumerable#each_slice, sent to arr, returns an enumerator:
enum1 = arr.each_slice(2)
#=> #<Enumerator: [
# "19d97e408ee3f993745b053e281ac9dc69519e06", "refs/heads/auto",
# "8f6f47c6e8023540b022586e368c68e1e814ce6d", "refs/heads/callout_hooks",
# ...
# "906dfe6eebff832baf0f92683d751432fcc98ab7", "refs/heads/regression"
# ]:each_slice(2)>
Enumerator#with_object is given an array consisting of an initially-empty hash (represented by the block variable h) and an initially-empty array for the "extras" (represented by the block variable ex). It is sent to enum1 to create another enumerator, which you can think of as a "compound enumerator" (note the reference to each_slice(2)>:with_object([{}, []]) below).
enum2 = enum1.with_object([{},[]])
#=> #<Enumerator: #<Enumerator: [
# "19d97e408ee3f993745b053e281ac9dc69519e06", "refs/heads/auto",
# "8f6f47c6e8023540b022586e368c68e1e814ce6d", "refs/heads/callout_hooks",
# ...
# "906dfe6eebff832baf0f92683d751432fcc98ab7", "refs/heads/regression"
# ]:each_slice(2)>:with_object([{}, []])>
We can convert enum2 to an array to see what it will be passing into its block:
enum2.to_a
#=> [[["19d97e408ee3f993745b053e281ac9dc69519e06", "refs/heads/auto"],
# [{}, []]],
# [["8f6f47c6e8023540b022586e368c68e1e814ce6d", "refs/heads/callout_hooks"],
# [{}, []]],
# [["3cbdb4b2fcb85bc7f0ed08b62e2bf2445a7659e8", "refs/heads/elab"],
# [{}, []]],
# [["d38a9a26ef887c08b306bdab210b39882f58e587", "refs/heads/elab_6.1"],
# [{}, []]],
# [["19d97e408ee3f993745b053e281ac9dc69519e06", "refs/heads/master"],
# [{}, []]],
# [["906dfe6eebff832baf0f92683d751432fcc98ab7", "refs/heads/regression"],
# [{}, []]],
The first element that enum2 passes into its block is
[["19d97e408ee3f993745b053e281ac9dc69519e06", "refs/heads/auto"], [{}, []]]
The block variables are therefore assigned as follows:
k = "19d97e408ee3f993745b053e281ac9dc69519e06"
v = "refs/heads/auto"
h = {}
ex = []
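The assignment of block variables follows Ruby's ordinary nested destructuring rules, so it can be reproduced outside a block. A minimal sketch, writing the first element out literally:

```ruby
# The first element passed by enum2, written out as a literal.
element = [["19d97e408ee3f993745b053e281ac9dc69519e06", "refs/heads/auto"], [{}, []]]

# Nested destructuring, mirroring the block signature |(k,v),(h,ex)|.
(k, v), (h, ex) = element
# k  => "19d97e408ee3f993745b053e281ac9dc69519e06"
# v  => "refs/heads/auto"
# h  => {}
# ex => []
```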
We now use Hash#update (aka Hash#merge!) to merge {k=>v} into h (h initially being empty). Therefore
h.update({k=>v}) { |k, ov, nv| ex << {k=>nv}; ov }
becomes
h.update({"19d97e408ee3f993745b053e281ac9dc69519e06"=>"refs/heads/auto"})
followed by the block
{ |k, ov, nv| ex << {k=>nv}; ov }
but the block is called only when the hash being merged into (h) and the hash being merged (update's argument) share a key k, in which case ov and nv are the values associated with that key in h and in the hash being merged, respectively. The merged value for key k will be whatever the block returns. That is exactly the situation we will be in when we encounter a duplicate.
So now
h #=> {"19d97e408ee3f993745b053e281ac9dc69519e06"=>"refs/heads/auto"}
We continue in this way for each of the other elements of enum2. When we encounter
k = "19d97e408ee3f993745b053e281ac9dc69519e06"
v = "refs/heads/master"
h = {"19d97e408ee3f993745b053e281ac9dc69519e06"=>"refs/heads/auto",
"8f6f47c6e8023540b022586e368c68e1e814ce6d"=>"refs/heads/callout_hooks",
"3cbdb4b2fcb85bc7f0ed08b62e2bf2445a7659e8"=>"refs/heads/elab",
"d38a9a26ef887c08b306bdab210b39882f58e587"=>"refs/heads/elab_6.1"}
we find that k is already in the merged hash h, so the block is evaluated to determine the value of k in the merged hash h. We want to keep the current value h[k], which is ov, so that is what the block returns. First, however, we append the duplicate, expressed as a hash, to the (still empty) array ex:
ex << {"19d97e408ee3f993745b053e281ac9dc69519e06" => "refs/heads/master"}
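This step can be replayed on its own by setting the variables to the state described above and making the same update call. The sketch below does only that; the inner block variable is renamed to `key` here purely to avoid shadowing k.

```ruby
k = "19d97e408ee3f993745b053e281ac9dc69519e06"
v = "refs/heads/master"
h = { "19d97e408ee3f993745b053e281ac9dc69519e06" => "refs/heads/auto",
      "8f6f47c6e8023540b022586e368c68e1e814ce6d" => "refs/heads/callout_hooks",
      "3cbdb4b2fcb85bc7f0ed08b62e2bf2445a7659e8" => "refs/heads/elab",
      "d38a9a26ef887c08b306bdab210b39882f58e587" => "refs/heads/elab_6.1" }
ex = []

# k is already a key of h, so the block runs: it records the new value
# in ex and returns ov, leaving h[k] unchanged.
h.update({ k => v }) { |key, ov, nv| ex << { key => nv }; ov }

# h[k] => "refs/heads/auto"
# ex   => [{"19d97e408ee3f993745b053e281ac9dc69519e06"=>"refs/heads/master"}]
```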