The reduce function is called many times. The map function runs on about 1/3 of the vnodes in the cluster (that's 22 times in a cluster with ring_size 64), and the reduce function is called each time results are available from a map function. Its first argument is a list containing both the result from the previous run of the reduce function and the new results from the map function. In your case, the reduce function counted the values returned from the first vnode; that count was then passed in as one more value alongside the second vnode's results, so it was counted as only a single value.
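To make the undercount concrete, here is a small simulation outside Riak. The function name `naiveCount` and the two sample vnode result lists are hypothetical, and the two-step batching is a simplification of however the cluster actually delivers results:

```javascript
// Naive reduce: assumes every input element is a raw map result.
function naiveCount(values) {
  return [values.length];
}

// Hypothetical map output from two vnodes (3 objects and 2 objects).
var vnodeA = [{ key: "a1" }, { key: "a2" }, { key: "a3" }];
var vnodeB = [{ key: "b1" }, { key: "b2" }];

// First reduce call sees only vnode A's results.
var first = naiveCount(vnodeA);

// Second call sees the previous result plus vnode B's results:
// [3, {b1}, {b2}] has length 3, so the earlier count of 3 is
// treated as one element and the true total of 5 is lost.
var second = naiveCount(first.concat(vnodeB));
```

Here `second` ends up as `[3]` even though five objects were mapped.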
What you need to do is have the reduce function return a value that is easy to tell apart from the map results, such as an object with a marker field:
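    function(o) {
        var prevCount = 0;     // sum of running totals from earlier reduce calls
        var countObjects = 0;  // number of running-total markers in this input
        for (var i = 0; i < o.length; i++) {
            var e = o[i];
            // typeof null is 'object', so rule it out before reading the field
            if (e !== null && typeof e === 'object' && typeof e.reduce_running_total === 'number') {
                prevCount += e.reduce_running_total;
                countObjects += 1;
            }
        }
        // Each marker stands for the total it carries; every other element counts as one.
        return [{"reduce_running_total": o.length + prevCount - countObjects}];
    }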
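As a rough check of this approach, the same logic can be run locally against staged input. The name `countReduce` and the sample vnode lists are hypothetical, and the staging is only an approximation of how Riak feeds results back into the reduce phase:

```javascript
// Same counting logic as above, given a name so it can be called directly.
function countReduce(values) {
  var prevCount = 0;
  var countObjects = 0;
  for (var i = 0; i < values.length; i++) {
    var e = values[i];
    if (e !== null && typeof e === "object" &&
        typeof e.reduce_running_total === "number") {
      prevCount += e.reduce_running_total;
      countObjects += 1;
    }
  }
  return [{ reduce_running_total: values.length + prevCount - countObjects }];
}

// Hypothetical map output from two vnodes.
var vnodeA = [{ key: "a1" }, { key: "a2" }, { key: "a3" }];
var vnodeB = [{ key: "b1" }, { key: "b2" }];

// First call: three raw objects, no marker.
var first = countReduce(vnodeA);

// Second call: one marker carrying 3, plus two raw objects.
var second = countReduce(first.concat(vnodeB));

// Re-reducing a lone marker leaves the total unchanged.
var again = countReduce(second);
```

The running total goes from 3 to 5 and stays at 5 when the result is reduced again, which is the property the reduce phase relies on.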
Or, you could save some network bandwidth: instead of having the map phase return all of the objects, have the map function return a literal [1] for each key found. The reduce function then simply sums all the numbers in its input list and returns the total as a single-element list.
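A minimal sketch of that second approach might look like the following. The names `mapOne` and `sumReduce` are hypothetical (in a real job these would be submitted as the map and reduce phase sources), and the argument lists assume the usual Riak JavaScript phase signatures:

```javascript
// Map phase: emit a single 1 for every key, ignoring the object itself.
function mapOne(value, keyData, arg) {
  return [1];
}

// Reduce phase: add up every number in the input. A previous partial
// sum is just another number, so re-reducing is safe.
function sumReduce(values, arg) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return [total];
}

// Local simulation: three keys on one vnode, two on another.
var first = sumReduce([1, 1, 1]);
var second = sumReduce(first.concat([1, 1]));
```

Because addition does not care whether an input is a raw 1 or an earlier partial sum, no marker object is needed, and only small integers cross the network.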