Fast generation of 1000 means of sample points from a truncated gamma distribution with 1000 different shape and scale values in R

StackOverflow https://stackoverflow.com/questions/23170828

Question

After searching the forum, I did not find any similar questions. If you find one, please let me know; I would really appreciate it.

I need to generate 1000 means of sample points from a truncated gamma distribution with 1000 different shape and scale values in R.

The following code works but is very slow. How can I improve its performance?

library(distr)
library(distrEx)
library(truncdist)
set.seed(RANDOM.SEED)
shape.list <- runif(1000, max = 10, min = 0.01)
scale.list <- runif(1000, max = 100000, min = 100000)
mean.list <- list()
std.dev.list <- list()
for (i in seq(1000)) # very slow
{
  sample.points <- rtrunc(100000, spec="gamma", a = lb.arg, b = ub.arg, 
                         shape = shape.list[[i]], scale = scale.list[[i]])
  sample.mean <- mean(sample.points)
  mean.list <- append(mean.list, sample.mean)
  sample.std.dev <- sd(sample.points)
  std.dev.list <- append(std.dev.list, sample.std.dev)
}

The for loop takes a very long time.

Any better solutions would be appreciated. Thanks!

Was it helpful?

Solution

There are several things going on here.

First, you calculate the scales as:

scale.list <- runif(1000, max = 100000, min = 100000)

but since min = max, all the values are identical.
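Presumably the intent was a range of scales; a minimal sketch of a fix (the lower bound of 100 here is only a guess at what was intended):

scale.list <- runif(1000, min = 100, max = 100000)  # min < max, so the scales actually vary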

Second, you do not specify lb.arg or ub.arg, so I set them to 20 and 50 arbitrarily.

Third, profiling this code with Rprof shows that more than 90% of the time is spent in the qtrunc(...) function, which is called by rtrunc(...). This is because you are generating 100,000 sample points at each iteration, and qtrunc(...) has to sort these. The total run time scales roughly linearly with n, the number of sample points per iteration. On my system, running all 1000 iterations with n = 1,000 takes about 7 sec, so n = 100,000 would take about 700 sec, or roughly 12 minutes.
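For reference, a minimal profiling sketch along those lines; the output file name prof.out and the single illustrative rtrunc(...) call are my own choices, not from the original post:

library(truncdist)
Rprof("prof.out")                  # start collecting profiling samples
x <- rtrunc(100000, spec = "gamma", a = 20, b = 50, shape = 5, scale = 2)
Rprof(NULL)                        # stop profiling
summaryRprof("prof.out")$by.self   # per-function timings; qtrunc should dominate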

My advice would be to try a smaller n and see if that really makes a difference. From the central limit theorem we know that the distribution of the sample mean is asymptotically Normal for large n, regardless of the underlying distribution, and its standard error shrinks as sd/sqrt(n), so going from n = 1,000 to n = 100,000 only reduces the error in the estimated mean by a factor of 10. I doubt that makes a practical difference here.
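As a quick sanity check, you can compare the two sample sizes directly; the shape, scale, and truncation bounds below are arbitrary stand-ins:

library(truncdist)
set.seed(1)
m.small <- mean(rtrunc(1000,   spec = "gamma", a = 20, b = 50, shape = 5, scale = 2))
m.large <- mean(rtrunc(100000, spec = "gamma", a = 20, b = 50, shape = 5, scale = 2))
c(m.small, m.large)  # the two means should be close; the extra n buys little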

Finally, the idiomatic way to do this in R is [using n=1000]:

f <- function(shape, scale, lb, ub) {
  X <- rtrunc(1000, spec = "gamma", a = lb, b = ub, shape = shape, scale = scale)
  return(c(mean = mean(X), sd = sd(X)))
}
# set.seed(1)  # use this for a reproducible sample
result <- mapply(f, shape.list, scale.list, lb = 20, ub = 50)
result.apply <- data.frame(t(result))

which produces a data frame with two columns: the mean and the sd for each shape/scale pair. By setting the seed to the same fixed value just before running mapply(...) and just before running your for loop, you can show that the two produce identical results.
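A sketch of that check, restricted to the first 10 shape/scale pairs to keep it quick (using the f defined above):

set.seed(1)
res.mapply <- mapply(f, shape.list[1:10], scale.list[1:10], lb = 20, ub = 50)
set.seed(1)
res.loop <- sapply(1:10, function(i) f(shape.list[i], scale.list[i], lb = 20, ub = 50))
identical(res.mapply, res.loop)  # TRUE: same seed, same order of draws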

Other tips

Unfortunately, there is just no way to optimize this task away. There are certainly some slight optimizations possible around the generation of random points from the truncated distribution... but here is the thing: generating on the order of 10^8 points from a random distribution is going to be slow no matter what.

Here are a few optimizations that I tried anyway, which do speed up the process a little:

  1. generating all the uniform random numbers at once, outside the loop

  2. going back to the definition of a truncated distribution (inverse-CDF sampling; see the sketch after this list), without relying on the "fancy" packages (distr, distrEx, truncdist)

  3. compiling my code to speed it up
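For reference on point 2: if F is the gamma CDF, then F^{-1}(F(a) + U * (F(b) - F(a))) with U uniform on [0, 1] is a draw from the gamma truncated to [a, b]. A minimal standalone sketch of this idea (the function name rtgamma is mine, not from any package):

rtgamma <- function(n, shape, scale, a, b) {
  p.a <- pgamma(a, shape = shape, scale = scale)  # F(a)
  p.b <- pgamma(b, shape = shape, scale = scale)  # F(b)
  # map uniforms from [0, 1] onto [F(a), F(b)], then invert the CDF
  qgamma(p.a + runif(n) * (p.b - p.a), shape = shape, scale = scale)
}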

Code:

# your original code, in a function
func = function()
{
  library(distr)
  library(distrEx)
  library(truncdist)
  set.seed(42)
  shape.list <- runif(1000, max = 10, min = 0.01)
  scale.list <- runif(1000, max = 100000, min = 100000)
  mean.list <- list()
  std.dev.list <- list()
  ITE.NUMBER = 10
  POINTS.NUMBER = 100000
  A = 0.25
  B = 0.5
  for (i in seq(ITE.NUMBER)) # very slow
  {
    sample.points <- rtrunc(POINTS.NUMBER, spec="gamma", a = A, b = B, 
                            shape = shape.list[[i]], scale = scale.list[[i]])
    sample.mean <- mean(sample.points)
    mean.list <- append(mean.list, sample.mean)
    sample.std.dev <- sd(sample.points)
    std.dev.list <- append(std.dev.list, sample.std.dev)
  }
}


# custom code
func2 = function()
{
  set.seed(42)
  shape.list <- runif(1000, max = 10, min = 0.01)
  scale.list <- runif(1000, max = 100000, min = 100000)
  mean.list <- list()
  std.dev.list <- list()
  ITE.NUMBER = 10
  POINTS.NUMBER = 100000
  A=0.25
  B=0.5
  # 
  # we generate all the random numbers at once, outside the loop
  #
  r <- runif(POINTS.NUMBER*ITE.NUMBER, min = 0, max = 1)
  for (i in seq(ITE.NUMBER)) # still very slow
  {
    #
    # back to the definition of the truncated gamma (inverse-CDF sampling);
    # index with i so each iteration uses its own slice of r
    #
    p.a <- pgamma(A, shape = shape.list[[i]], scale = scale.list[[i]])
    p.b <- pgamma(B, shape = shape.list[[i]], scale = scale.list[[i]])
    u <- r[(1 + POINTS.NUMBER * (i - 1)):(POINTS.NUMBER * i)]
    sample.points <- qgamma(p.a + u * (p.b - p.a),
                            shape = shape.list[[i]], scale = scale.list[[i]])
    sample.mean <- mean(sample.points)
    mean.list <- append(mean.list, sample.mean)
    sample.std.dev <- sd(sample.points)
    std.dev.list <- append(std.dev.list, sample.std.dev)
  }
}

#
# maybe compilation would help?
#  
require(compiler)
func2_compiled <- cmpfun(func2)

require(microbenchmark)

microbenchmark(func2(), func2_compiled(), func(),  times=10)

Which gives the following:

Unit: seconds
             expr      min       lq   median       uq      max neval
          func2() 1.462768 1.465561 1.475692 1.489235 1.532693    10
 func2_compiled() 1.403956 1.477983 1.487945 1.499133 1.515504    10
           func() 1.457553 1.478829 1.502671 1.510276 1.513486    10

Conclusions:

  1. As said before, there is almost no room for improvement: the task is simply very resource-intensive, and there is little to be done about that.

  2. Compilation, if anything, made things slightly worse... which is expected: there is no silly use of bad programming techniques here (e.g. big, ugly for loops) for the compiler to fix.

  3. If you are really looking for a speed improvement, you may be better off with another language, though I doubt that you would be able to achieve significantly better performance.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow