Question

I'm trying to perform the dot product on all possible combinations of vectors. I am able to find all the possible combinations. I just can't quite figure out how the FUN argument in combn() works. Below is my code, thanks for any help!

def=c("Normal.def","Fire.def","Water.def","Electric.def","Grass.def","Ice.def",
       "Fighting.def","Poison.def","Ground.def","Flying.def","Pyschic.def","Bug.def",
       "Rock.def","Ghost.def","Dragon.def","Null.def")

combn(def,2,FUN=def%*%def,simplify=TRUE)

Solution 2

Why not just matrix-multiply the whole thing? For example:

set.seed(1)
vec1 <- sample(1:10)
vec2 <- sample(1:10)
vec3 <- sample(1:10)

rbind(vec1, vec2, vec3) %*% cbind(vec1, vec2, vec3)

produces:

     vec1 vec2 vec3
vec1  385  298  284
vec2  298  385  296
vec3  284  296  385

Each cell of the matrix is the dot product of the two vectors named in its row and column labels. Alternatively, if you really want to do it with combn:

vec.lst <- list(vec1, vec2, vec3)
combn(
  seq_along(vec.lst), 2, 
  FUN=function(idx) c(vec.lst[[idx[[1]]]] %*% vec.lst[[idx[[2]]]])
)

Which produces:

[1] 298 284 296

Notice how those numbers correspond to the upper triangle of the matrix. For small data sets the matrix multiply approach is much faster. For large ones, particularly ones where the vectors are very long but there aren't many of them, the combn approach might be faster, since it computes only the upper triangle rather than the full matrix.
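As for the original question about FUN: combn calls FUN once per combination and passes it the selected elements as a single vector of length m. Since `def` in the question holds names rather than numbers, FUN receives a character vector of length 2 each time and has to look the actual vectors up somewhere. Here is a minimal sketch, assuming the names are columns of a data frame; the data frame `scores` and its values are made up for illustration and are not from the question.

```r
# Hypothetical data: three numeric columns named like entries of `def`.
scores <- data.frame(
  Normal.def = c(1, 2, 3),
  Fire.def   = c(4, 5, 6),
  Water.def  = c(7, 8, 9)
)
def <- names(scores)

# FUN receives one combination at a time: a character vector of length 2.
dots <- combn(def, 2, FUN = function(pair) {
  sum(scores[[pair[1]]] * scores[[pair[2]]])
})

# Label each result with the pair it came from; `collapse` is passed
# through combn's `...` to paste().
names(dots) <- combn(def, 2, FUN = paste, collapse = ":")
dots
```

Using sum(x * y) instead of %*% returns a plain number, so there is no 1x1 matrix to unwrap with c().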

OTHER TIPS

Using @BrodieG's sample data, you can just use the crossprod function:

set.seed(1)
vec1 <- sample(1:10)
vec2 <- sample(1:10)
vec3 <- sample(1:10)

crossprod(cbind(vec1, vec2, vec3))
#      vec1 vec2 vec3
# vec1  385  298  284
# vec2  298  385  296
# vec3  284  296  385
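
For reference, crossprod(X) gives the same result as t(X) %*% X, and crossprod(x, y) gives t(x) %*% y, so it also covers a single pair of vectors. A small check with made-up vectors (not the sample data above):

```r
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
X  <- cbind(v1, v2)

G <- crossprod(X)                      # all pairwise dot products at once
same_as_explicit <- isTRUE(all.equal(G, t(X) %*% X))

single_dot <- drop(crossprod(v1, v2))  # one pair; drop() strips the 1x1 matrix
single_dot
```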

Some benchmarks, out of curiosity:

The functions to run:

fun1 <- function() {
  A <- crossprod(do.call(cbind, lst))
  A[upper.tri(A)]
} 
fun2 <- function() {
  A <- do.call(rbind, lst) %*% do.call(cbind, lst)
  A[upper.tri(A)]
} 
fun3 <- function() {
  combn(
    seq_along(lst), 2, 
    FUN=function(idx) c(lst[[idx[[1]]]] %*% lst[[idx[[2]]]])
  )
}
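
One caveat if you compare the outputs rather than just the timings: A[upper.tri(A)] walks the matrix column by column, while combn walks the pairs in lexicographic order. The two orders agree for three vectors but differ from four vectors onward. A sketch with made-up values:

```r
# Four short vectors (values chosen only for illustration).
lst <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9), d = c(1, 0, 1))
A <- crossprod(do.call(cbind, lst))

# Column-major order: (1,2) (1,3) (2,3) (1,4) (2,4) (3,4)
by_upper <- A[upper.tri(A)]

# combn order: (1,2) (1,3) (1,4) (2,3) (2,4) (3,4)
by_combn <- combn(seq_along(lst), 2,
                  FUN = function(idx) sum(lst[[idx[1]]] * lst[[idx[2]]]))

# Same values, different order.
same_values <- isTRUE(all.equal(sort(by_upper), sort(by_combn)))

# A is symmetric, so its lower triangle comes out in combn order.
by_lower <- A[lower.tri(A)]
same_order <- isTRUE(all.equal(by_lower, by_combn))
```

So if the order matters, A[lower.tri(A)] is the drop-in match for the combn result.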

Benchmarking on a small number of large vectors:

library(microbenchmark)

set.seed(1)
n <- 5
lst <- setNames(replicate(n, sample(1:100000), simplify = FALSE), 
                paste0("V", sequence(n)))

microbenchmark(fun1(), fun2(), fun3())
# Unit: milliseconds
#    expr       min        lq    median        uq      max neval
#  fun1()  6.909651  6.992031  8.432346  8.520301 74.12263   100
#  fun2() 17.290101 18.811134 19.144601 21.292544 88.10602   100
#  fun3() 22.841209 24.283113 24.427876 25.820158 91.14007   100

Not being patient enough to run microbenchmark on a medium number of medium-sized vectors, here are single system.time runs instead:

set.seed(1)
n <- 1000
lst <- setNames(replicate(n, sample(1:1000), simplify = FALSE), 
                paste0("V", sequence(n)))

system.time(fun1())
#   user  system elapsed 
#  0.245   0.004   0.251 

system.time(fun2())
#   user  system elapsed 
#  0.407   0.016   0.425 

system.time(fun3())
#   user  system elapsed 
# 14.216   0.004  14.339 
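
The gap in the last timing is probably down to the number of R-level function calls: combn calls FUN once per pair, and the number of pairs grows quadratically with the number of vectors.

```r
pairs_small <- choose(5, 2)      # 10 pairs in the first benchmark
pairs_large <- choose(1000, 2)   # 499500 pairs in the second
c(pairs_small, pairs_large)
```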
Licensed under: CC-BY-SA with attribution