Question

I have an array of non-negative values. I want to build an array of values whose sum is 20 so that they are proportional to the first array.

This would be an easy problem, except that I want the proportional array to sum to exactly 20, compensating for any rounding error.

For example, the array

input = [400, 400, 0, 0, 100, 50, 50]

would yield

output = [8, 8, 0, 0, 2, 1, 1]
sum(output) = 20

However, most cases are going to have a lot of rounding errors, like

input = [3, 3, 3, 3, 3, 3, 18]

naively yields

output = [1, 1, 1, 1, 1, 1, 10]
sum(output) = 16  (ouch)

Is there a good way to apportion the output array so that it adds up to 20 every time?

Solution

There's a very simple answer to this question: I've done it many times. After each assignment into the new array, you reduce the values you're working with as follows:

  1. Call the first array A, and the new, proportional array B (which starts out empty).
  2. Call the sum of A elements T
  3. Call the desired sum S.
  4. For each element of the array (i) do the following:
    a. B[i] = round(A[i] / T * S). (rounding to nearest integer, penny or whatever is required)
    b. T = T - A[i]
    c. S = S - B[i]

That's it! Easy to implement in any programming language or in a spreadsheet.
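The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `apportion` and the variable names are made up for illustration, and the handling of a zero remaining sum is an added assumption that the steps do not spell out. Note that Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding rule you need.

```python
def apportion(values, total):
    """Scale non-negative values to integers that sum to total (sketch)."""
    remaining_sum = sum(values)   # T in the steps above
    remaining_total = total       # S in the steps above
    result = []                   # B in the steps above
    for v in values:
        if remaining_sum == 0:
            # Assumption: only zeros are left, so there is nothing to share out.
            share = 0
        else:
            share = round(v * remaining_total / remaining_sum)
        result.append(share)
        remaining_sum -= v        # T = T - A[i]
        remaining_total -= share  # S = S - B[i]
    return result


print(apportion([400, 400, 0, 0, 100, 50, 50], 20))
print(apportion([3, 3, 3, 3, 3, 3, 18], 20))
```

If every input value is zero, this sketch returns all zeros, so the result cannot sum to the target in that case.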

The solution is optimal in that the resulting array's elements will never be more than 1 away from their ideal, non-rounded values. Let's demonstrate with your example:
T = 36, S = 20. B[1] = round(A[1] / T * S) = 2. (ideally, 1.666....)
T = 33, S = 18. B[2] = round(A[2] / T * S) = 2. (ideally, 1.666....)
T = 30, S = 16. B[3] = round(A[3] / T * S) = 2. (ideally, 1.666....)
T = 27, S = 14. B[4] = round(A[4] / T * S) = 2. (ideally, 1.666....)
T = 24, S = 12. B[5] = round(A[5] / T * S) = 2. (ideally, 1.666....)
T = 21, S = 10. B[6] = round(A[6] / T * S) = 1. (ideally, 1.666....)
T = 18, S = 9.   B[7] = round(A[7] / T * S) = 9. (ideally, 10)

Notice that when you compare every value in B with its ideal value in parentheses, the difference is never more than 1.

It's also interesting to note that rearranging the elements in the array can result in different corresponding values in the resulting array. I've found that arranging the elements in ascending order is best, because it results in the smallest average percentage difference between actual and ideal.
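One way to try the ascending-order idea without changing the order of the output is to process the indices in order of increasing value and write each share back to its original position. The sketch below assumes that approach; `apportion_ascending` is a hypothetical name, and the running-remainder loop is repeated inline so the block stands on its own.

```python
def apportion_ascending(values, total):
    """Run the running-remainder method over values sorted ascending (sketch)."""
    # Indices of the input, ordered by increasing value (ties keep input order).
    order = sorted(range(len(values)), key=lambda i: values[i])
    remaining_sum = sum(values)
    remaining_total = total
    result = [0] * len(values)
    for i in order:
        if remaining_sum == 0:
            share = 0  # assumption: nothing left to distribute
        else:
            share = round(values[i] * remaining_total / remaining_sum)
        result[i] = share  # store at the original position
        remaining_sum -= values[i]
        remaining_total -= share
    return result


print(apportion_ascending([18, 3, 3, 3, 3, 3, 3], 20))
```

Whether ascending order is actually best for your data is worth checking against your own inputs; the sketch only makes the comparison easy to run.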

OTHER TIPS

Your problem is similar to proportional representation, where you want to share N seats (in your case 20) among parties in proportion to the votes they obtain (in your case [3, 3, 3, 3, 3, 3, 18]).

There are several methods used in different countries to handle the rounding problem. My code below uses the Hagenbach-Bischoff quota method used in Switzerland, which basically allocates the seats remaining after an integer division by (N+1) to parties which have the highest remainder:

def proportional(nseats, votes):
    """Assign nseats seats proportionally to votes using the Hagenbach-Bischoff quota.
    :param nseats: int number of seats to assign
    :param votes: list of int or float weighting each party
    :result: list of ints seats allocated to each party
    """
    quota = sum(votes) / (1.0 + nseats)  # force float
    frac = [vote / quota for vote in votes]
    res = [int(f) for f in frac]
    n = nseats - sum(res)  # number of seats remaining to allocate
    if n == 0:
        return res  # done
    if n < 0:
        return [min(x, nseats) for x in res]  # see siamii's comment
    # give the remaining seats to the n parties with the largest remainder
    remainders = [ai - bi for ai, bi in zip(frac, res)]
    limit = sorted(remainders, reverse=True)[n - 1]
    # parties whose remainder is at least the limit get an extra seat
    for i, r in enumerate(remainders):
        if r >= limit:
            res[i] += 1
            n -= 1  # attempt to handle perfect equality
            if n == 0:
                return res  # done
    raise RuntimeError("seat allocation failed")  # should never happen

However, this method doesn't always give the same number of seats to parties that are perfectly tied, as in your case:

proportional(20,[3, 3, 3, 3, 3, 3, 18])
[2,2,2,2,1,1,10]

You have set 3 incompatible requirements. An integer-valued array proportional to [1,1,1] cannot be made to sum to exactly 20. You must choose to break one of the "sum to exactly 20", "proportional to input", and "integer values" requirements.

If you choose to break the requirement for integer values, then use floating point or rational numbers. If you choose to break the exact sum requirement, then you've already solved the problem. Choosing to break proportionality is a little trickier. One approach you might take is to figure out how far off your sum is, and then distribute corrections randomly through the output array. For example, if your input is:

[1, 1, 1]

then you could first make it sum as well as possible while still being proportional:

[7, 7, 7]

and since 20 - (7+7+7) = -1, choose one element to decrement at random:

[7, 6, 7]

If the error was 4, you would choose four elements to increment.
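A minimal sketch of this random-correction idea is shown below. It makes several assumptions that the description above leaves open: each value is first rounded to the nearest integer, increments are only given to elements whose input is non-zero, decrements are only taken from elements that are currently positive, and the input contains at least one non-zero value. The name `apportion_random` and the `rng` parameter are hypothetical.

```python
import random


def apportion_random(values, total, rng=random):
    """Round each proportional share, then fix the sum at random (sketch)."""
    value_sum = sum(values)  # assumed to be non-zero
    result = [round(v * total / value_sum) for v in values]
    error = total - sum(result)
    if error > 0:
        # Too small: increment randomly chosen elements with non-zero input.
        candidates = [i for i, v in enumerate(values) if v > 0]
        step = 1
    else:
        # Too large (or exact): decrement randomly chosen positive elements.
        candidates = [i for i, r in enumerate(result) if r > 0]
        step = -1
    # Each chosen element is corrected by exactly one unit.
    for i in rng.sample(candidates, abs(error)):
        result[i] += step
    return result


print(apportion_random([1, 1, 1], 20))
```

Because the corrections are random, repeated calls can return different arrays for the same input; pass a seeded `random.Random` instance as `rng` if you need reproducible output.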

A naïve solution that doesn't perform well, but will provide the right result...

Write a function that, given an array with eight integers (candidate) and the input array, returns the index of the element that is farthest below being proportional to the others (pseudocode):

function next_index(candidate, input)
    // Calculate weights (zero inputs get an infinite weight so they are never picked)
    for i in 1 .. 8
        if input[i] > 0 then
            w[i] = candidate[i] / input[i]
        else
            w[i] = infinity
        end if
    end for
    // find the smallest weight
    min = infinity
    min_index = 0
    for i in 1 .. 8
        if w[i] < min then
            min = w[i]
            min_index = i
        end if
    end for

    return min_index
end function

Then just do this

result = [0, 0, 0, 0, 0, 0, 0, 0]
result[next_index(result, input)]++ for 1 .. 20

If there is no optimal solution, it'll skew towards the beginning of the array.
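For reference, here is one possible Python rendering of this greedy approach, generalized to an input of any length. It is a sketch under stated assumptions: the inputs are integers (so `fractions.Fraction` can compare the weights exactly), zero inputs are skipped so they never receive a unit, and ties go to the earliest index. The name `apportion_greedy` is hypothetical.

```python
from fractions import Fraction


def next_index(candidate, values):
    """Return the index whose candidate/value ratio is smallest (sketch)."""
    best_index = None
    best_weight = None
    for i, (c, v) in enumerate(zip(candidate, values)):
        if v == 0:
            continue  # assumption: zero inputs never receive anything
        weight = Fraction(c, v)  # exact ratio, avoids float comparison issues
        if best_weight is None or weight < best_weight:
            best_weight, best_index = weight, i
    if best_index is None:
        raise ValueError("all input values are zero")
    return best_index


def apportion_greedy(values, total):
    """Hand out `total` units one at a time to the most under-served index."""
    result = [0] * len(values)
    for _ in range(total):
        result[next_index(result, values)] += 1
    return result


print(apportion_greedy([400, 400, 0, 0, 100, 50, 50], 20))
```

Starting from all zeros, every non-zero element receives one unit before proportionality starts to matter, so small inputs can end up over-served relative to large ones.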

Using the approach above, you can reduce the number of iterations by rounding down (as you did in your example) and then just use the approach above to add what has been left out due to rounding errors:

result = <<approach using rounding down>>
while sum(result) < 20
    result[next_index(result, input)]++
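The combined approach (round down first, then top up) might look like this in Python. As before, this is only a sketch: it assumes integer inputs with at least one non-zero value, and `apportion_floor_then_top_up` is a made-up name.

```python
from fractions import Fraction


def apportion_floor_then_top_up(values, total):
    """Floor each proportional share, then add the missing units greedily."""
    value_sum = sum(values)  # assumed to be non-zero
    # Integer division rounds every share down, so the sum is at most `total`.
    result = [v * total // value_sum for v in values]
    eligible = [i for i, v in enumerate(values) if v > 0]
    while sum(result) < total:
        # Pick the index with the smallest result/value ratio (first one on ties).
        idx = min(eligible, key=lambda i: Fraction(result[i], values[i]))
        result[idx] += 1
    return result


print(apportion_floor_then_top_up([3, 3, 3, 3, 3, 3, 18], 20))
```

The loop runs only as many times as there are units lost to rounding down, which is fewer than the number of elements.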

So the answers and comments above were helpful... particularly the decreasing sum comment from @Frederik.

The solution I came up with takes advantage of the fact that for an input array v, sum(v_i * 20) is divisible by sum(v). So for each value in v, I multiply by 20 and divide by the sum. I keep the quotient and accumulate the remainder. Whenever the accumulator reaches or exceeds sum(v), I add one to the value. That way I'm guaranteed that all the remainders get rolled into the results.

Is that legible? Here's the implementation in Python:

def proportion(values, total):
    # set up by getting the sum of the values and starting
    # with an empty result list and accumulator
    sum_values = sum(values)
    new_values = []
    acc = 0

    for v in values:
        # for each value, find quotient and remainder
        q, r = divmod(v * total, sum_values)

        if acc + r < sum_values:
            # if the accumulator plus remainder is too small, just add and move on
            acc += r
        else:
            # we've accumulated enough to go over sum(values), so add 1 to result
            if acc > r:
                # add to previous
                new_values[-1] += 1
            else:
                # add to current
                q += 1
            acc -= sum_values - r

        # save the new value
        new_values.append(q)

    # accumulator is guaranteed to be zero at the end
    print(new_values, sum_values, acc)

    return new_values

(I added an enhancement: if the accumulator is greater than the remainder, I increment the previous value instead of the current value.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow