How can this imperative code be rewritten to be more functional?

Question 1

Here are two more functional versions in Clojure and JavaScript, but the ideas here should work in any language that supports closures. Basically, we use recursion instead of iteration to accomplish the same thing, and instead of breaking in the middle we just return a value and stop recursing.

Original pseudo code:

R = (some random int); 
T = 0; 
for o in os 
    T = T + o.weight; 
    if T > R 
        return o;

Clojure version (objects are just treated as clojure maps):

(defn recursive-version
  [r objects]
  (loop [t 0
         others objects]
    (let [obj (first others)
          new_t (+ t (:weight obj))]
      (if (> new_t r)
        obj
        (recur new_t (rest others))))))

JavaScript version (using underscore for convenience). Be careful, because this could blow out the stack. This is conceptually the same as the clojure version.

var js_recursive_version = function(objects, r) {
    var main_helper = function(t, others) {
        var obj = _.first(others);
        var new_t = t + obj.weight;
        if (new_t > r) {
          return obj;
        } else {
          return main_helper(new_t, _.rest(others));
        }
    };


    return main_helper(0, objects);
};

Question 2

You can implement this with a fold (aka Array#reduce, or Underscore's _.reduce):

An SSCCE:

items = [
  {item: 'foo', weight: 50}
  {item: 'bar', weight: 35}
  {item: 'baz', weight: 15}
]

r = Math.random() * 100

{item} = items.reduce (memo, {item, weight}) ->
  if memo.sum > r
    memo
  else
    {item, sum: memo.sum + weight}
, {sum: 0}

console.log 'r:', r, 'item:', item

You can run it many times at coffeescript.org and see that the results make sense :)

That being said, i find the fold a bit contrived, as you have to remember both the selected item and the accumulated weight between iterations, and it doesn't short-circuit when the item is found.

Maybe a compromise solution between pure FP and the tedium of reimplementing a find algorithm can be considered (using _.find):

total = 0
{item} = _.find items, ({weight}) ->
  total += weight
  total > r

Runnable example.

I find (no pun intended) this algorithm much more accessible than the first one (and it should perform better, as it doesn't create intermediate objects, and it does short-circuiting).

Update/side-note: the second algorithm is not "pure" because the function passed to _.find is not referentially transparent (it has the side effect of modifying the external total variable), but the whole of the algorithm is referentially transparent. If you were to encapsulate it in a findItem = (items, r) -> function, the function will be pure and will always return the same output for the same input. That's a very important thing, because it means that you can get the benefits of FP while using some non-FP constructs (for performance, readability, or whatever reason) under the hoods :D

Question 3

I think the underlying task is randomly selecting 'events' (objects) from array os with a frequency defined by their respective weights. The approach is to map (i.e. search) a random number (with uniform distribution) onto the stairstep cumulative probability distribution function.

With positive weights, their cumulative sum is increasing from 0 to 1. The code you gave us simply searches starting at the 0 end. To maximize speed with repeated calls, pre calculate sums, and order the events so the largest weights are first.

It really doesn't matter whether you search with iteration (looping) or recursion. Recursion is nice in a language that tries to be 'purely functional' but doesn't help understanding the underlying mathematical problem. And it doesn't help you package the task into a clean function. The underscore functions are another way of packaging the iterations, but don't change the basic functionality. Only any and all exit early when the target is found.

For small os array this simple search is sufficient. But with a large array, a binary search will be faster. Looking in underscore I find that sortedIndex uses this strategy. From Lo-Dash (an underscore dropin), "Uses a binary search to determine the smallest index at which the value should be inserted into array in order to maintain the sort order of the sorted array"

The basic use of sortedIndex is:

os = [{name:'one',weight:.7},
      {name:'two',weight:.25},
      {name:'three',weight:.05}]
t=0; cumweights = (t+=o.weight for o in os)
i = _.sortedIndex(cumweights, R)
os[i]

You can hide the cumulative sum calculation with a nested function like:

osEventGen = (os)->
  t=0; xw = (t+=y.weight for y in os)
  return (R) ->
    i = __.sortedIndex(xw, R)
    return os[i]
osEvent = osEventGen(os)
osEvent(.3)
# { name: 'one', weight: 0.7 }
osEvent(.8)
# { name: 'two', weight: 0.25 }
osEvent(.99)
# { name: 'three', weight: 0.05 }

In coffeescript, Jed Clinger's recursive search could be written like this:

foo = (x, r, t=0)->
  [y, x...] = x
  t += y
  return [y, t] if x.length==0 or t>r
  return foo(x, r, t)

An loop version using the same basic idea is:

foo=(x,r)->
  t=0
  while x.length and t<=r
    [y,x...]=x  # the [first, rest] split
    t+=y
    y

Tests on jsPerf http://jsperf.com/sortedindex suggest that sortedIndex is faster when os.length is around 1000, but slower than the simple loop when the length is more like 30.