Question

Let's say that you have a hash where the keys are Strings, and the values are Floats. You want to group the values by a substring of each key, and then sum the values within each group.

Basically, you want to go from this:

{ "aaaapattern1aaaa" => 213.2342, "pattern2aaaa" => 0.03, 
  "aaaaapattern3" => 12.1, "pattern1aaa" => 54.4544, 
  "aaaaapattern2" => 65.003 }

to this:

{"pattern1"=>267.6886, "pattern2"=>65.033, "pattern3"=>12.1}

Here is my current approach:

data = {
  "aaaapattern1aaaa"=>213.2342, "pattern2aaaa"=>0.03, 
  "aaaaapattern3"=>12.1, "pattern1aaa"=>54.4544, 
  "aaaaapattern2"=>65.003
}

key_regexp = /pattern\d/

intermediate_results = data.map do |key, value| 
  { key.match(key_regexp)[0] => value } 
end

final_result = intermediate_results.reduce do |cumulative_hash, individual_hash| 
  cumulative_hash.merge(individual_hash) do |key, old_value, new_value| 
    old_value + new_value 
  end
end

How would you improve on this? What factors should be considered in formulating an ideal approach? Would your answer change based on the size of the Hash, and if so, how?

Solution

That's a lot of work for what should be pretty simple:

sums = Hash.new(0)

data.each do |key, value|
  if (m = key.match(/pattern\d/))
    sums[m[0]] += value
  end
end

sums
# => {"pattern1"=>267.6886, "pattern2"=>65.033, "pattern3"=>12.1}

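This has the advantage of ignoring any key that doesn't match. The original approach calls [0] on the result of key.match, which is nil for a non-matching key, so it would raise NoMethodError.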

Here Hash.new(0) creates a Hash that has a default value of 0. This is a good pattern to use for assembling sums of arbitrary things.
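To see the default-value behaviour in isolation, here is a minimal sketch. The words array and the counts name are made up for this example and are not part of the question.

```ruby
# Hypothetical input, not taken from the question.
words = %w[apple pear apple fig pear apple]

# A missing key reads as 0, so += works without initialising each key first.
counts = Hash.new(0)
words.each { |word| counts[word] += 1 }

counts          # => {"apple"=>3, "pear"=>2, "fig"=>1}
counts["kiwi"]  # => 0, and "kiwi" is still not stored as a key
```

Reading a missing key returns the default without adding the key, so only keys that were actually assigned end up in the result.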

Other suggestions

I think I would do something like the code below. I cannot say much about its performance, though.

sums = Hash.new(0) # initialize the hash with 0 as the default value
data.each do |k, v|
  case k # switch on the key
  when /pattern1/ # each `when` tests the key against a regex
    sums[:pattern_1] += v
  when /pattern2/
    sums[:pattern_2] += v
  else
    # undefined pattern: ignored
  end
end

If you use each_with_object you can make it rather compact.

Assuming data and key_regexp are as you defined:

data.each_with_object(Hash.new(0)) do |(k,v),r|
  r[k.match(key_regexp)[0]] += v
end
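
One caveat with the compact version: k.match(key_regexp) returns nil for a key that does not match, so calling [0] on it raises NoMethodError. A possible variant that skips such keys is sketched below, using String#[] with a regexp, which returns the matched substring or nil. The "unrelated" entry is a hypothetical extra key added only to exercise that case.

```ruby
data = {
  "aaaapattern1aaaa" => 213.2342, "pattern2aaaa" => 0.03,
  "aaaaapattern3" => 12.1, "pattern1aaa" => 54.4544,
  "aaaaapattern2" => 65.003,
  "unrelated" => 1.0 # hypothetical non-matching key, added for this example
}
key_regexp = /pattern\d/

sums = data.each_with_object(Hash.new(0)) do |(key, value), result|
  group = key[key_regexp] # matched substring, or nil when nothing matches
  result[group] += value if group
end
```

Because the values are Floats, the sums may differ from the hand-computed totals in the last decimal places, so compare them with a tolerance rather than with ==.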
Licensed under: CC-BY-SA with attribution