Question

I found the following code here for eliminating duplicate records in an array:

require 'set'

class Array
  def uniq_by
    seen = Set.new
    select{ |x| seen.add?( yield( x ) ) }
  end
end

And we can use the code above as follows:

@messages = Messages.all.uniq_by { |h| h.body }

I would like to know how and what happens when the method is called. Can someone explain the internals of the code above? In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?


Solution 3

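In Ruby, any method can be called with a block, even if its parameter list does not mention one. When you put the yield keyword inside a method (say #bar), you are telling Ruby to invoke the block that was passed to #bar. Conceptually, yield behaves like Proc#call on that block, although Ruby only materializes an actual Proc object when the method captures the block explicitly (for example with a &block parameter).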

Example:

def bar
   yield
end

p bar { "hello" } # "hello" 
p bar # `bar': no block given (yield) (LocalJumpError)

In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?

You did: you used yield. That is all a method needs in order to use a block. In the line Messages.all.uniq_by { |h| h.body } you pass the block { |h| h.body }, and inside uniq_by each yield( x ) invokes that block with x as its argument, much as Proc#call would.

Proof:

def bar
   p block_given? # true
   yield
end

bar { "hello" } # prints true (from block_given?); the call itself returns "hello"
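
To make the comparison with Proc#call concrete, here is a minimal sketch that puts the implicit form (yield) next to the explicit form (a &block parameter). The method names implicit_bar and explicit_bar are made up for this illustration.

```ruby
# Implicit block: the method never names the block, it just yields to it.
def implicit_bar
  yield
end

# Explicit block: &block captures the same block as a Proc object,
# which is then invoked with Proc#call.
def explicit_bar(&block)
  block.call
end

p implicit_bar { "hello" }
p explicit_bar { "hello" }
```

Both calls return the value of the block, so from the caller's side the two forms are interchangeable.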

To make this easier to follow, the original method:

class Array
  def uniq_by
    seen = Set.new
    select{ |x| seen.add?( yield( x ) ) }
  end
end

is the same as

class Array
  def uniq_by
    seen = Set.new
    # Using `yield` below tells Ruby to call the block that was
    # passed to uniq_by, with x as the block's argument.
    select{ |x| var = yield(x); seen.add?(var) }
  end
end

From the documentation of yield:

Called from inside a method body, yields control to the code block (if any) supplied as part of the method call. If no code block has been supplied, calling yield raises an exception. yield can take an argument; any values thus yielded are bound to the block's parameters. The value of a call to yield is the value of the executed code block.
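
The points in that description can be checked with a small sketch. The method names twice and needs_block are hypothetical and exist only for this illustration.

```ruby
# The yielded value is bound to the block's parameter, and the value of
# the yield expression is whatever the block returns.
def twice(value)
  result = yield(value)
  result * 2
end

# Calling yield without a block raises LocalJumpError; here the error is
# rescued and its message returned so it can be inspected.
def needs_block
  yield
rescue LocalJumpError => e
  e.message
end

p twice(5) { |n| n + 1 }  # the block returns 6, so twice returns 12
p needs_block             # the message of the rescued LocalJumpError
```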

OTHER TIPS

Let's break it down:

seen = Set.new

Create an empty set

select{ |x| seen.add?( yield( x ) ) }

Array#select keeps the elements for which its block returns a truthy value.

seen.add?(yield(x)) returns the set itself (which is truthy) if the result of the block could be added to the set, or nil if that value was already present.

Indeed, yield(x) will call the block passed to the uniq_by method, and pass x as an argument.

In our case, since our block is { |h| h.body }, it would be the same as calling seen.add?(x.body)

Since a set holds only unique values, calling add? with a value that already exists returns nil.

So it calls .body on each element of the array and tries to add the result to the set, keeping only the elements for which the addition succeeded.
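
The walkthrough above can be tried on a small data set. This is only a sketch: the Message struct is a hypothetical stand-in for the records returned by Messages.all, and the sample data is made up.

```ruby
require 'set'

class Array
  def uniq_by
    seen = Set.new
    select { |x| seen.add?(yield(x)) }
  end
end

# Hypothetical stand-in for a model with a body attribute.
Message = Struct.new(:id, :body)

messages = [
  Message.new(1, "hi"),
  Message.new(2, "bye"),
  Message.new(3, "hi"),  # same body as message 1, so it is dropped
]

unique = messages.uniq_by { |m| m.body }
p unique.map(&:id)  # [1, 2]
```

The first element with a given body wins, because its body is added to the set before any later duplicate is examined.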

The method uniq_by accepts a block argument. This lets you specify the criterion by which two elements are considered duplicates.

The yield statement evaluates the given block for the element and returns the block's value, here the element's body attribute. So, if you call uniq_by as above, you are stating that two elements count as duplicates when their body attributes are equal; only the first element with a given body is kept.

To answer the more specific question you have: yield calls the passed block { |h| h.body } like a method, substituting the current x for h, and therefore returns x.body.

Array#select returns a new array containing all elements of the array for which the given block returns a true value.

The block passed to select uses Set#add? to determine whether the value is already there. add? returns nil if the set already contains the same element; otherwise it adds the element and returns the set itself.

That block in turn passes its argument (an element of the array) to another block (the one passed to uniq_by) using yield. The return value of yield is the return value of that block ({ |h| h.body }).

For this particular call, the select statement is equivalent to the following:
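
The two return values of Set#add? are easy to see in isolation. A minimal sketch:

```ruby
require 'set'

seen = Set.new

first  = seen.add?("hi")  # "hi" is new: it is added and the set is returned
second = seen.add?("hi")  # "hi" is already present: nil is returned

p first.equal?(seen)  # true
p second              # nil
p seen.size           # 1
```

Because the set is truthy and nil is falsy, the result of add? can be used directly as the condition inside select.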

select{ |x| seen.add?(x.body) }

But by using yield, the code avoids hard-coding .body and defers that decision to the block.
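
As a sketch of what that deferral buys, the same method can deduplicate by a different criterion on each call, without any change to uniq_by itself. The word list is made-up sample data.

```ruby
require 'set'

class Array
  def uniq_by
    seen = Set.new
    select { |x| seen.add?(yield(x)) }
  end
end

words = ["apple", "Apple", "pear", "plum"]

# Duplicates decided case-insensitively.
p words.uniq_by { |w| w.downcase }  # ["apple", "pear", "plum"]

# Duplicates decided by length; a symbol passed with & also works,
# because it reaches the method as a block.
p words.uniq_by(&:length)           # ["apple", "pear"]
```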

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow