Question

I am using the following function to estimate the Kernel density of some data

KDens = function(x,h,N) {
         fx = matrix(0,N,1)
         Kx = AK(x,h,N)
         for (i in 1:N) {
         fx[i] = sum(Kx[i,], na.rm=T)/N
         }
         return(fx) }

I know this is not the first question about speeding up a loop. I checked around in the site, I found that sometimes using some apply function is faster, but this is not always the case if you manage to correctly set the loop.

In the above code, every "not needed thing" is left out of the loop, as - if I understood correctly - suggested to speed up computation. However, I made a comparison between the above KDens function and the density function implemented in R by default. Well, density needs 1 or 2 seconds, while KDens needs ~30 on my machine.

trywiththis <- rnorm(4800)
x = trywiththis
N = length(trywiththis)
h = 1.059*sd(trywiththis , na.rm=T)*(N^(-0.2))

EDIT: the information I provided was not complete

kerf = function(x){ return(dnorm(x)) }
ker = function(x,x0,h){
       temp = kerf((x-x0)/h)/h
       return(temp)
       }

AK = function(x,h,N) {
      K = array(0,c(N,N))                 
         for (i in 1:N) {
         for (j in 1:N) {
         K[i,j] = ker(x[i],x[j],h) 
       }}
       return(K) }

Suppose I want to speed up the KDens function, how could I do that ?

Was it helpful?

Solution

Try this... For your original 4800 length dataset it takes 2.5 seconds.

KDens2 = function(x,h,N) {
Kx <- outer( x , x , FUN = function(x,y) dnorm( ( x-y ) / h ) / h )
fx <- as.matrix( rowSums( Kx ) / N , ncol = 1 )
return( fx )
}

Testing

set.seed(1)
trywiththis <- rnorm(480)
x = trywiththis
N = length(trywiththis)
h = 1.059*sd(trywiththis , na.rm=T)*(N^(-0.2))

#Produce same result? (posibly not identical because of 'dnorm' function)
all.equal( KDens(x,h,N) , KDens2(x,h,N) )
[1] TRUE

#Rough timing on N=480 length data...
system.time( KDens2(x,h,N) )
#   user  system elapsed 
#   0.01    0.00    0.02 

system.time( KDens(x,h,N) )
#   user  system elapsed 
#    2.7     0.0     2.7 

And when N=4800...

system.time( KDens2(x,h,N) )
   user  system elapsed 
   2.33    0.19    2.51
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top