Weighted random integers
Question
I want to assign weightings to a randomly generated number, with the weightings represented below.
0 | 1 | 2 | 3 | 4 | 5 | 6
─────────────────────────
X | X | X | X | X | X | X
X | X | X | X | X | X |
X | X | X | X | X |   |
X | X | X | X |   |   |
X | X | X |   |   |   |
X | X |   |   |   |   |
X |   |   |   |   |   |
What's the most efficient way to do it?
Solution
@Kerrek's answer is good.
But if the histogram of weights is not all small integers, you need something more powerful:
Divide [0..1] into intervals sized with the weights. Here you need segments with relative size ratios 7:6:5:4:3:2:1. So the size of one interval unit is 1/(7+6+5+4+3+2+1)=1/28, and the sizes of the intervals are 7/28, 6/28, ... 1/28.
These comprise a probability distribution because they sum to 1.
Now find the cumulative distribution:
P(x) => x
7/28 => 0
13/28 => 1
18/28 => 2
22/28 => 3
25/28 => 4
27/28 => 5
28/28 => 6
Now generate a random number r in [0..1] and look it up in this table by finding the smallest x such that r <= P(x). This is the random value you want.
The table lookup can be done with binary search, which is a good idea when the histogram has many bins.
Note that you are effectively evaluating the inverse of the cumulative distribution function, so this is sometimes called inverse transform sampling (the method of inverse transforms).
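Here is a minimal sketch of the lookup step using std::lower_bound for the binary search. The names lookup and sample are hypothetical, and the code assumes the table is non-empty, non-decreasing, and ends at 1.0.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: smallest index x such that r <= cdf[x].
// Assumes cdf is non-empty, non-decreasing, and its last entry is 1.0.
std::size_t lookup(const std::vector<double>& cdf, double r)
{
    // lower_bound returns the first element that is not less than r.
    auto it = std::lower_bound(cdf.begin(), cdf.end(), r);
    if (it == cdf.end()) --it;  // guard against rounding in the last entry
    return static_cast<std::size_t>(it - cdf.begin());
}

// Draw one weighted value using any standard random engine.
template <class Rng>
std::size_t sample(const std::vector<double>& cdf, Rng& rng)
{
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    return lookup(cdf, uniform(rng));
}
```

Each draw costs O(log n) comparisons for n bins, which is where the binary search pays off over a linear scan.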
Other tips
If your array is small, just pick a uniform random index into the following array:
int a[] = {0,0,0,0,0,0,0, 1,1,1,1,1,1, 2,2,2,2,2, 3,3,3,3, 4,4,4, 5,5, 6};
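Picking the index could be sketched like this. The names kTable and pick are only for illustration, and any standard random engine should work as the argument.

```cpp
#include <cstddef>
#include <random>

// The lookup table from above: value v appears (7 - v) times.
static const int kTable[] = {0,0,0,0,0,0,0, 1,1,1,1,1,1, 2,2,2,2,2, 3,3,3,3, 4,4,4, 5,5, 6};

// Hypothetical helper: choose a uniformly random slot and return its value.
template <class Rng>
int pick(Rng& rng)
{
    const std::size_t n = sizeof(kTable) / sizeof(kTable[0]);  // 28 slots
    std::uniform_int_distribution<std::size_t> index(0, n - 1);  // bounds are inclusive
    return kTable[index(rng)];
}
```

Each draw is a single array read, at the cost of one table slot per unit of weight.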
If you want to generate the distribution at runtime, use std::discrete_distribution.
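A small sketch of how that might be used. The function draw and its parameters are made up for this example; the weights are whatever you have computed at runtime.

```cpp
#include <random>
#include <vector>

// Hypothetical helper: draw n weighted indices in [0, weights.size()).
// Index i is produced with probability weights[i] / sum(weights).
std::vector<int> draw(const std::vector<double>& weights, int n, unsigned seed)
{
    std::mt19937 rng(seed);
    // Build the distribution once and reuse it for every draw.
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    std::vector<int> out(n);
    for (int& v : out) v = dist(rng);
    return out;
}
```

With weights {7, 6, 5, 4, 3, 2, 1}, the value 0 should come up about seven times as often as 6.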
To get the distribution you want, first add up the number of X's in your diagram. You can do it like this (my C is super rusty, so treat this as pseudocode):
int num_cols = 7; // for your example
// Total number of X's: num_cols + (num_cols - 1) + ... + 1.
// num_cols * (num_cols + 1) is always even, so the division is exact
// and no separate even/odd cases are needed.
int max = num_cols * (num_cols + 1) / 2; // 28 for your example
Then you need to randomly select an integer between 1 and max inclusive. So if your random integer is r, the last step is to find which column holds the r'th X. Something like this should work:
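// Returns the column that holds the r'th X, for r in [1, max].
int find_column(int r, int num_cols)
{
    for (int i = 0; i < num_cols; i++)
    {
        r -= (num_cols - i); // column i holds (num_cols - i) X's
        if (r < 1) return i;
    }
    return -1; // not reached when 1 <= r <= max
}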
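Putting the three steps together (count the X's, pick r, find the column) might look like the sketch below. The function name weighted_column is made up, and using std::mt19937 as the random source is an assumption, since the steps above do not say where r comes from.

```cpp
#include <random>

// Hypothetical end-to-end version of the steps above.
// Returns column i with probability proportional to (num_cols - i).
int weighted_column(int num_cols, std::mt19937& rng)
{
    const int max = num_cols * (num_cols + 1) / 2;    // total number of X's
    std::uniform_int_distribution<int> dist(1, max);  // 1..max inclusive
    int r = dist(rng);
    for (int i = 0; i < num_cols; ++i) {
        r -= (num_cols - i);  // skip past the X's in column i
        if (r < 1) return i;
    }
    return num_cols - 1;  // not reached for r in [1, max]
}
```

This needs no table at all, but the loop is linear in the number of columns.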