This would be much easier to answer if you had provided an example of your data (or all of it). Since you didn't, here is an approach using sample data:
# create sample data
set.seed(1)
MyMatrix <- data.frame(group=rep(1:5, each=100),matrix(rnorm(2500),ncol=5))
# generate list of covariance matrices by group
cov.list <- lapply(unique(MyMatrix$group),
                   function(x) cov(MyMatrix[MyMatrix$group==x,-1],
                                   use="na.or.complete"))
cov.list[1]
# [[1]]
#             X1          X2          X3          X4          X5
# X1  0.80676209 -0.09541458 -0.12704666 -0.04122976  0.08636307
# X2 -0.09541458  0.93350463 -0.05197573 -0.06457299 -0.02203141
# X3 -0.12704666 -0.05197573  1.06030090  0.07324986  0.01840894
# X4 -0.04122976 -0.06457299  0.07324986  1.12059428  0.02385031
# X5  0.08636307 -0.02203141  0.01840894  0.02385031  1.11101410
In this example we create a data frame called MyMatrix with six columns. The first is group, and the other five are X1, X2, ..., X5, which contain the data whose covariances we want. Hopefully, this is similar to the structure of your dataset.
The operative line of code is:
cov.list <- lapply(unique(MyMatrix$group),
                   function(x) cov(MyMatrix[MyMatrix$group==x,-1],
                                   use="na.or.complete"))
This takes the vector of group IDs (from unique(MyMatrix$group)) and calls the function with each of them. The function calculates the covariance matrix for all columns of MyMatrix except the first, using only the rows in the relevant group. lapply(...) collects the results in a 5-element list (there are 5 groups in this example).
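One drawback of the list built this way is that its elements are unnamed, so you have to remember which position belongs to which group. A possible variant, sketched here on the same sample data, uses split(...) to divide the data columns by group first; the name cov.by.group is just an illustrative choice, not something from your code.

```r
# recreate the sample data from above
set.seed(1)
MyMatrix <- data.frame(group = rep(1:5, each = 100),
                       matrix(rnorm(2500), ncol = 5))

# original approach: unnamed list, one covariance matrix per group
cov.list <- lapply(unique(MyMatrix$group),
                   function(x) cov(MyMatrix[MyMatrix$group == x, -1],
                                   use = "na.or.complete"))

# variant: split the data columns by group, then apply cov to each piece;
# the resulting list is named by group id
cov.by.group <- lapply(split(MyMatrix[, -1], MyMatrix$group),
                       cov, use = "na.or.complete")

names(cov.by.group)       # group ids as character strings
cov.by.group[["3"]]       # covariance matrix for group 3
```

With this version you can look up a group's matrix by its id, e.g. cov.by.group[["3"]], rather than by position.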
Note on handling NA: there are several options; review the documentation at ?cov to see what they are. The method chosen here, use="na.or.complete", includes in the calculation only rows that have no NA values in any of the columns. If, for a given group, there are no such rows, cov(...) returns NA.
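To see the NA behaviour concretely, here is a small sketch on a made-up data frame (d and the result names are hypothetical, chosen only for this illustration). It first blanks out a few values, then blanks out a whole column so that no complete rows remain.

```r
set.seed(1)
d <- data.frame(matrix(rnorm(50), ncol = 5))   # 10 rows, columns X1..X5

# put NA into the first three rows of X1
d$X1[1:3] <- NA

# only the 7 complete rows are used ...
c.partial <- cov(d, use = "na.or.complete")
# ... so the result matches cov() on those rows alone
c.manual  <- cov(d[4:10, ])
all.equal(c.partial, c.manual)

# now make every row incomplete
d$X2 <- NA_real_
c.none <- cov(d, use = "na.or.complete")
all(is.na(c.none))        # nothing could be computed

# for comparison, "complete.obs" signals an error in this situation
res <- tryCatch(cov(d, use = "complete.obs"), error = function(e) e)
inherits(res, "error")
```

If you would rather use every available pair of observations for each pair of columns, use="pairwise.complete.obs" is another documented option, though the resulting matrix is then built from different subsets of rows.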