Question

I have an array of hashes, and I want the unique values out of it. Calling Array.uniq doesn't give me what I expect.

a = [{:a => 1},{:a => 2}, {:a => 1}]
a.uniq # => [{:a => 1}, {:a => 2}, {:a => 1}]

Where I expected:

[{:a => 1}, {:a => 2}]

Searching around the net, I didn't come up with a solution that I was happy with. Folks recommended redefining Hash#eql? and Hash#hash, since those are the methods Array#uniq queries.

Edit: Where I ran into this in the real world, the hashes were slightly more complex. They were the result of parsed JSON with multiple fields, some of whose values were hashes as well. I had an array of those results and wanted to filter it down to the unique values.

I don't like the solution of redefining Hash#eql? and Hash#hash, because I would have to redefine them either globally on Hash or on each entry in my array. Redefining them for each entry would be cumbersome, especially since there may be nested hashes inside each entry.

Changing Hash globally has some potential, especially if it were done temporarily. I'd want to build another class or helper function that wrapped saving off the old definitions, and restoring them, but I think this adds more complexity than is really needed.

Using inject seems like a good alternative to redefining Hash.


Solution

I can get what I want by calling inject:

a = [{:a => 1},{:a => 2}, {:a => 1}]
a.inject([]) { |result,h| result << h unless result.include?(h); result }

This will return:

[{:a=>1}, {:a=>2}]
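
Because Array#include? compares elements with ==, and Hash#== compares contents (including nested hashes), the same approach also covers the nested case mentioned in the question. A minimal sketch; the method name unique_hashes and the sample records are made up for illustration. Note that include? scans the result array each time, so this is quadratic in the number of elements.

```ruby
# Hypothetical helper: keeps the first occurrence of each distinct element.
# Relies on Array#include?, which compares elements with ==.
def unique_hashes(array)
  array.inject([]) do |result, h|
    result << h unless result.include?(h)
    result
  end
end

# Invented sample data with nested hashes; the first and last are equal.
records = [
  { :id => 1, :meta => { :tags => ["x"] } },
  { :id => 2, :meta => { :tags => ["y"] } },
  { :id => 1, :meta => { :tags => ["x"] } }
]

unique_records = unique_hashes(records)
puts unique_records.length
```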

OTHER TIPS

Ruby 1.8.7+ will return just what you expected:

[{:a=>1}, {:a=>2}, {:a=>1}].uniq
#=> [{:a=>1}, {:a=>2}] 
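
This works when Hash#hash and Hash#eql? are computed from the hash's contents, since those are the two methods uniq relies on. The snippet below is a small check of that on parsed JSON with nested values, assuming a Ruby that bundles the json library; the JSON strings are invented sample data.

```ruby
require 'json'

# Invented sample data: the first and third documents are identical.
raw = [
  '{"name": "a", "opts": {"depth": 1}}',
  '{"name": "b", "opts": {"depth": 2}}',
  '{"name": "a", "opts": {"depth": 1}}'
]
parsed = raw.map { |s| JSON.parse(s) }

# Equal contents give equal hash codes and eql? is true, even when nested.
same_code = parsed[0].hash == parsed[2].hash
same_eql  = parsed[0].eql?(parsed[2])

unique_parsed = parsed.uniq
puts unique_parsed.length
```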

I had a similar situation, but my hashes had a key to compare on, so I used a sorting approach.

What I mean:

You have an array:

[{:x=>1},{:x=>2},{:x=>3},{:x=>2},{:x=>1}]

You sort it (#sort_by {|t| t[:x]}) and get this:

[{:x=>1}, {:x=>1}, {:x=>2}, {:x=>2}, {:x=>3}]

Now a slightly modified version of the answer by Aaron Hinni:

your_array.inject([]) do |result,item| 
  result << item if !result.last||result.last[:x]!=item[:x]
  result
end

I've also tried:

test.inject([]) {|r,h| r<<h unless r.find {|t| t[:x]==h[:x]}; r}.sort_by {|t| t[:x]}

but it's very slow. Here is my benchmark:

require 'benchmark'

test=[]
1000.times {test<<{:x=>rand}}

Benchmark.bmbm do |bm|
  bm.report("sorting: ") do
    test.sort_by {|t| t[:x]}.inject([]) {|r,h| r<<h if !r.last||r.last[:x]!=h[:x]; r}
  end
  bm.report("inject: ") {test.inject([]) {|r,h| r<<h unless r.find {|t| t[:x]==h[:x]}; r}.sort_by {|t| t[:x]} }
end

Results:

Rehearsal ---------------------------------------------
sorting:    0.010000   0.000000   0.010000 (  0.005633)
inject:     0.470000   0.140000   0.610000 (  0.621973)
------------------------------------ total: 0.620000sec

                user     system      total        real
sorting:    0.010000   0.000000   0.010000 (  0.003839)
inject:     0.480000   0.130000   0.610000 (  0.612438)
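
The gap is what one would expect: the find-based version rescans the result array for every element, so its cost grows quadratically, while the sort-based version is dominated by the sort. If the original order should be kept, a single pass that remembers the keys already seen avoids both the sort and the rescanning. A sketch using Set from the standard library; unique_by_key is a hypothetical helper name, and :x is the key from the example above. It has not been benchmarked here.

```ruby
require 'set'

# Hypothetical helper: keeps the first hash seen for each value of `key`.
def unique_by_key(array, key)
  seen = Set.new
  # Set#add? returns nil when the value is already present,
  # so select keeps only the first occurrence of each key value.
  array.select { |item| seen.add?(item[key]) }
end

sample = [{ :x => 1 }, { :x => 2 }, { :x => 3 }, { :x => 2 }, { :x => 1 }]
deduped = unique_by_key(sample, :x)
p deduped
```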

Assuming your hashes are always single key-value pairs, this will work:

a.map {|h| h.to_a[0]}.uniq.map {|k,v| {k => v}}

Hash#to_a creates an array of key-value arrays, so the first map gets you:

[[:a, 1], [:a, 2], [:a, 1]]

uniq on Arrays does what you want, giving you:

[[:a, 1], [:a, 2]]

and then the second map puts them back together as hashes again.
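
The same round trip can be extended to hashes with more than one pair by converting the whole hash rather than only its first pair. A sketch, assuming Hash[] is used to rebuild each hash; the variable names are illustrative. Two hashes holding the same pairs in a different insertion order produce different arrays, so this would not merge them.

```ruby
multi = [{ :a => 1, :b => 2 }, { :a => 2, :b => 2 }, { :a => 1, :b => 2 }]

# Hash#to_a gives [[key, value], ...]; Hash[] turns that back into a hash.
unique_multi = multi.map { |h| h.to_a }.uniq.map { |pairs| Hash[pairs] }
p unique_multi
```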

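You can use uniq, optionally with a block (tested in Ruby 1.9.3). Note that uniq_by is an ActiveSupport extension rather than core Ruby; core Array#uniq accepts a block since 1.9.2:

[{a: 1}, {a: 2}, {a: 1}].uniq
#=> [{:a=>1}, {:a=>2}]
[{a: 1, b: 2}, {a: 2, b: 2}, {a: 1, b: 3}].uniq {|v| v[:a]}
#=> [{:a=>1, :b=>2}, {:a=>2, :b=>2}]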

The answer you give is similar to the one discussed here. It overrides the hash and eql? methods on the hashes that are to appear in the array which then makes uniq behave correctly.

The pipe method on arrays (available since 1.8.6) performs set union (returning an array), so the following is another possible way to get unique elements of any array a:

[] | a
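
Applied to the array from the question, as a quick check. This relies on the same content-based hash comparison as uniq, so on a Ruby where uniq fails for hashes it would presumably fail as well (an assumption, not something tested here).

```ruby
a = [{ :a => 1 }, { :a => 2 }, { :a => 1 }]

# Union with an empty array drops duplicates and keeps first-seen order.
union = [] | a
p union
```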

Licensed under: CC-BY-SA with attribution