Question

I am working with an array of roughly 2000 elements in C++.

Each element represents the probability of that element being selected randomly.

I then have to convert this array into a cumulative array, with the intention of using it to work out which element to choose when a die is rolled.

Example array: {1,2,3,4,5}

Example cumulative array: {1,3,6,10,15}

I want to be able to select the element with weight 3 when the numbers 3, 4 or 5 are rolled (counting rolls from 0, so that element covers 3 of the 15 possible rolls).

The added complexity is that my array is made up of long doubles. Here's an example of a few consecutive elements:

0.96930161525189592646367317541056252139242133125662803649902343750 0.96941377254127855667142910078837303444743156433105468750000000000 0.96944321382974149711383993199831365927821025252342224121093750000 0.96946143938926617454089618153290075497352518141269683837890625000 0.96950069444055009509463721739663810694764833897352218627929687500 0.96951751803395748961766908990966840065084397792816162109375000000

This could be a terrible way of doing weighted probabilities with this data set, so I'm open to any suggestions of better ways of working this out.

Solution

You can use std::partial_sum from the <numeric> header:

#include <numeric>

const unsigned int SIZE = 5;
int array[SIZE] = {1,2,3,4,5};
int partials[SIZE] = {0};

std::partial_sum(array, array + SIZE, partials);
// partials is now {1,3,6,10,15}

The value you want from the array is available from the partial sums:

12 == array[2] + array[3] + array[4];

12 == partials[4] - partials[1];

The total is obviously the last value in the partial sums:

15 == partials[4];

OTHER TIPS

Consider storing the information as an integer numerator and denominator so that there is no loss of precision until the final step.

You can actually do this using stream selection without having to compute an array of partial sums. Here's code I have for this in Java:

public static int selectRandomWeighted(double[] wts, Random rnd) {
    int selected = 0;
    double total = wts[0];

    for( int i = 1; i < wts.length; i++ ) {
        total += wts[i];

        if( rnd.nextDouble() <= (wts[i] / total)) {
            selected = i;
        }
    }

    return selected;        
}

The above could potentially be further improved using Kahan summation if you want to preserve as many digits of accuracy in the sum as possible.

However, if you want to draw from this array repeatedly, then pre-computing an array of partial sums and using binary search to find the right index will be faster.

Ok I think I've solved this one.

I just did a binary search, but instead of just having

if (arr[middle] == value)

I added in an OR

if (arr[middle] == value || (arr[middle] < value && arr[middle+1] > value))

This seems to handle it in the way I was hoping for.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow