So I've been working on this for a while, testing as I go. For anyone else stuck on a similar problem, here are two more-optimized versions of the code. I've significantly decreased the computation time; however, it still blows up with too many data entries. My next step would be to implement this with Rcpp and, if possible, make use of the 12 cores I have available (with the end goal of computing 1-2 million entries in a reasonable time-frame). I'm not sure about the best way of proceeding on either point, but here is my code. Thank you for the help!
##################################
##############Optimized code
library(compiler)  # for cmpfun()

t.m <- t(test_euclid_log)

# Sum of distances to the k nearest neighbours. The sorted distance
# vector has the zero self-distance in position 1, so start at 2;
# note vec[1:k+1] is vec[(1:k)+1], written more clearly below.
knn_log <- function(vec, k) {
  sum(vec[2:(k + 1)])
}
knn_log <- cmpfun(knn_log)
# Euclidean distance from point x to every column of t.m
distf <- function(x, t.m) sqrt(colSums((x - t.m)^2))
distf <- cmpfun(distf)
myfunc <- function(tab) {
  rowsums <- numeric(nrow(tab))
  knnsums_log <- matrix(nrow = nrow(tab), ncol = 4)
  for (i in 1:nrow(tab)) {
    # distances from row i to every observation
    q <- apply(tab[i, ], 1, distf, t.m = t.m)
    rowsums[i] <- colSums(q)
    q <- sort(q)
    # k-nearest-neighbour sums for k = 1..4
    for (kn in 1:4) {
      knnsums_log[i, kn] <- knn_log(q, kn)
    }
  }
  return(cbind(rowsums, knnsums_log))
}
myfunc <- cmpfun(myfunc)
system.time(output <- myfunc(t))
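Since the distance computation dominates, one further option (a sketch, not something I've benchmarked against the versions above) is to drop the per-row loop entirely and let base R's `dist()` build the full pairwise distance matrix in one call; `m` below is a small toy stand-in for the real data, with observations in rows:

```r
# Sketch: all pairwise Euclidean distances at once. `m` is a toy
# stand-in for the real data (observations in rows).
m <- matrix(rnorm(20), nrow = 5)
d <- as.matrix(dist(m))   # symmetric matrix with a zero diagonal
rowsums <- rowSums(d)     # total distance from each observation
# k-nearest-neighbour sums for k = 1..4: skip the zero self-distance
# in position 1 of each sorted row, then take cumulative sums.
knnsums <- t(apply(d, 1, function(v) cumsum(sort(v)[2:5])))
```

The catch is that the full n-by-n matrix is O(n^2) memory, so at 1-2 million entries it would have to be done blockwise rather than in one shot.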
And my attempt using the apply family:
###############Vectorized
myfuncvec <- function(tab) {
  kn <- 1:4
  # distances from this row to every observation
  q <- apply(tab, 1, distf, t.m = t.m)
  rowsums <- colSums(q)
  q <- sort(q)
  # k-nearest-neighbour sums for k = 1..4
  knnsums_log <- vapply(kn, knn_log, vec = q, FUN.VALUE = numeric(1))
  return(c(rowsums, knnsums_log))
}
myfuncvec <- cmpfun(myfuncvec)
t1 <- split(t, row(t))
system.time(out <- vapply(t1, myfuncvec, FUN.VALUE = numeric(5)))
out <- t(out)  # vapply returns results as columns; transpose back to rows
For reference, the first version seems to be the faster of the two.