Question

I am trying to calculate the conditional standard deviation of a matrix B (for every column) based on the values of matrix A.

#conditional function
foo<-function(x,y)
{
  out<-sd(y[abs(x)==1])
  return(out)
}

#create the matrix
A<-matrix(data=c(1,-1,0,1,0,0,0,0,1,1),nrow=5,ncol=2)
B<-matrix(data=c(3,4,5,6,7,8,9,10,11,12),nrow=5,ncol=2)

#run for the first column
foo(A[,1],B[,1])

#run for both columns
apply(X=A, MARGIN=2, FUN=function(x,y) foo(x,y), y=B)

The correct answers are 1.53 and 0.707, which is what I get when I run foo directly on each column individually.

However, when I try to run both columns with apply, I get 3.06 and 2.94.

Any idea how to change the apply call to make it work? I have a large matrix of assets (in an xts object). Currently I am using a for loop, but I am sure it can be done more efficiently.

Thank you in advance,

Nikos

Solution

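The problem with your approach is that apply passes the whole matrix B to foo as y, while foo expects two vectors (x and y). Inside foo, the length-5 logical index abs(x)==1 is then recycled across all 10 elements of B, so values from both columns get mixed together. That is where 3.06 and 2.94 come from.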
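To see the recycling concretely, here is a small sketch that reproduces the unexpected first value by hand. It reuses A and B from the question; the names idx and picked are only illustrative.

```r
A <- matrix(c(1, -1, 0, 1, 0, 0, 0, 0, 1, 1), nrow = 5, ncol = 2)
B <- matrix(c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 5, ncol = 2)

idx <- abs(A[, 1]) == 1   # logical vector of length 5: TRUE TRUE FALSE TRUE FALSE

# Indexing the whole matrix treats B as a vector of 10 values,
# so the 5-element index is recycled over both columns
picked <- B[idx]
picked                    # 3 4 6 8 9 11
sd(picked)                # about 3.06, the first value apply() returned

# Restricting to the matching column gives the intended result
sd(B[idx, 1])             # about 1.53
```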

You could try something like this instead:

sapply(1:ncol(A), function(i) sd(B[as.logical(abs(A[,i])),i]))

[1] 1.5275252 0.7071068

Which is basically just a loop...
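If you would rather keep a plain apply call, one alternative (not part of the approach above, just a sketch) is to mask the entries of B you do not want with NA and let sd drop them. This assumes B has no missing values of its own, since na.rm = TRUE would silently drop those as well. Note also that it keeps the question's exact condition abs(x) == 1, whereas as.logical(abs(...)) treats any non-zero entry of A as TRUE. The names B_masked and res are illustrative.

```r
A <- matrix(c(1, -1, 0, 1, 0, 0, 0, 0, 1, 1), nrow = 5, ncol = 2)
B <- matrix(c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 5, ncol = 2)

# Work on a copy so the original B is left untouched
B_masked <- B
B_masked[abs(A) != 1] <- NA   # element-wise mask; A and B must have the same dimensions

# Column-wise standard deviation, ignoring the masked entries
res <- apply(B_masked, 2, sd, na.rm = TRUE)
res                           # about 1.53 and 0.707
```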

Another approach, if your A and B objects are data frames, is to use mapply:

# convert to data frames under new names so the original matrices are kept
A2 <- as.data.frame(A)
B2 <- as.data.frame(B)
mapply(foo, A2, B2)

       V1        V2 
1.5275252 0.7071068 

Benchmarking the two approaches, the sapply route is roughly twice as fast (median of about 114 vs. 207 microseconds). I suspect this is because sapply iterates over a vector of integers and indexes matrices, whereas mapply takes data frames as arguments: data frames are slower than matrices, and each iteration passes more information than a single index value. Details:

Unit: microseconds
                                                             expr     min      lq  median       uq      max neval
 sapply(1:ncol(A), function(i) sd(B[as.logical(abs(A[, i])), i])) 101.997 110.080 113.929 118.5480 1515.319  1000
                                              mapply(foo, A2, B2) 191.292 200.529 207.073 215.1555 1707.380  1000
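
The tool used for the timings above is not stated, so the following is only a rough base-R sketch of how one might check that both routes agree and compare their speed. The wrapper names by_index and by_frame and the repetition count n are hypothetical, and absolute timings will differ from machine to machine.

```r
A <- matrix(c(1, -1, 0, 1, 0, 0, 0, 0, 1, 1), nrow = 5, ncol = 2)
B <- matrix(c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 5, ncol = 2)
foo <- function(x, y) sd(y[abs(x) == 1])

A2 <- as.data.frame(A)
B2 <- as.data.frame(B)

# Wrap each route in a function so it can be called repeatedly
by_index <- function() sapply(1:ncol(A), function(i) sd(B[as.logical(abs(A[, i])), i]))
by_frame <- function() mapply(foo, A2, B2)

# Both routes should return the same numbers (names dropped for the comparison)
stopifnot(isTRUE(all.equal(by_index(), unname(by_frame()))))

# Hypothetical repetition count; increase it for more stable timings
n <- 1000
system.time(for (k in seq_len(n)) by_index())
system.time(for (k in seq_len(n)) by_frame())
```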

Licensed under: CC-BY-SA with attribution