Question

I've written a simulation in C++ that generates (1,000,000)^2 numbers from a specific probability distribution and then does something with them. So far I've used Exponential, Normal, Gamma, Uniform and Poisson distributions. Here is the code for one of them:

#include <boost/random.hpp>

...main...

    srand(time(NULL)) ;
    seed = rand();
    boost::random::mt19937 igen(seed) ;
    boost::random::variate_generator<boost::random::mt19937, boost::random::normal_distribution<> >
    norm_dist(igen, boost::random::normal_distribution<>(mu,sigma)) ;

Now I need to run it for the Beta distribution. All of the distributions I've done so far took 10-15 hours. The Beta distribution is not in the boost/random package so I had to use the boost/math/distributions package. I found this page on StackOverflow which proposed a solution. Here it is (copy-pasted):

#include <boost/math/distributions.hpp> 
using namespace boost::math;  
double alpha, beta, randFromUnif;  
//parameters and the random value on (0,1) you drew  
beta_distribution<> dist(alpha, beta); 
double randFromDist = quantile(dist, randFromUnif); 

I replicated it and it worked. The run time estimates of my simulation are linear and accurately predictable. They say that this will run for 25 days. I see two possibilities: 1. the method proposed is inferior to the one I was using previously for other distributions 2. the Beta distribution is just much harder to generate random numbers from

Bare in mind that I have below minimal understanding of C++ coding, so the questions I'm asking may be silly. I can't wait for a month for this simulation to complete, so is there anything I can do to improve that? Perhaps use the initial method that I was using and modify it to work with the boost/math/distributions package? I don't even know if that's possible.

Another piece of information that may be useful is that the parameters are the same for all (1,000,000)^2 of the numbers that I need to generate. I'm saying this because the Beta distribution does have a nasty PDF and perhaps the knowledge that the parameters are fixed can somehow be used to simplify the process? Just a random guess.

Was it helpful?

Solution

The beta distribution is related to the gamma distribution. Let X be a random number drawn from Gamma(α,1) and Y from Gamma(β,1), where the first argument to the gamma distribution is the shape parameter. Then Z=X/(X+Y) has distribution Beta(α,β). With this transformation, it should only take twice as much time as your gamma distribution test.

Note: The above assumes the most common representation of the gamma distribution, Gamma(shape,scale). Be aware that different implementations of the gamma distribution random generator will vary with the meaning and order of the arguments.

OTHER TIPS

If you want a distribution that is very Beta-like, but has a very simple closed-form inverse CDF, it's worth considering the Kumaraswamy distribution:

http://en.wikipedia.org/wiki/Kumaraswamy_distribution

It's used as an alternative to the Beta distribution when a large number of random samples are required quickly.

Try compiling with optimization. Using a flag -O3 will usually speed things up. See this post on optimisation flags or this overview for slightly more detail.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top