Rejection sampling is primarily useful for continuous distributions. What you need is to sample a discrete distribution. Fortunately, this is part of STL in C++11. So, adapted from the sample of std::discrete_distribution:
#include <iostream>
#include <map>
#include <random>
template <typename T>
class sampler
{
std::vector<T> keys;
std::discrete_distribution<T> distr;
public:
sampler(const std::vector<T>& keys, const std::vector<float>& prob) :
keys(keys), distr(prob.begin(), prob.end()) { }
T operator()()
{
static std::random_device rd;
static std::mt19937 gen(rd());
return keys[distr(gen)];
}
};
int main()
{
using T = int;
sampler<T> samp({19, 54, 192, 732}, {.1, .2, .4, .3});
std::map<T, size_t> hist;
for (size_t n = 0; n < 10000; ++n)
++hist[samp()];
for (auto i: hist)
{
std::cout << i.first << " generated " <<
i.second << " times" << std::endl;
}
}
Output:
19 generated 1010 times
54 generated 2028 times
192 generated 3957 times
732 generated 3005 times
Vectors keys
and prob
contain separately the keys and values (probabilities) of your map. This is because std::discrete_distribution
takes into account only the probabilities.
Note that operator()
cannot be const
because std::discrete_distribution
changes state (naturally) at every sample.
Also note that even you implement sampling yourself using the cumulative distribution and binary search (whereby sampling is logarithmic-time in the size of your domain), there are more efficient (constant-time) sampling methods like the alias method. I am not sure what method is used by
std::discrete_distribution
, however.