I don't know a way to get it in matrix form straight away, but I find this solution useful:
dt[, {x = value; dt[, cor(x, value), by = group]}, by=group]
group group V1
1: a a 1.0000000
2: a b 0.1556371
3: b a 0.1556371
4: b b 1.0000000
This is convenient because you start with a molten dataset and end up with a molten representation of the correlations.
With this form you can also compute only selected pairs. In particular, the correlation matrix is symmetric, so computing both off-diagonal halves is wasted work. For example:
dt[, {x = value; g = group; dt[group <= g, list(cor(x, value)), by = group]}, by=group]
group group V1
1: a a 1.0000000
2: b a 0.1556371
3: b b 1.0000000
Alternatively, the same form works just as well for the cross-correlation between two sets of groups (i.e. the off-diagonal block):
library(data.table)
set.seed(1) # reproducibility
dt1 <- data.table(id=1:4, group=rep(letters[1:2], c(4,4)), value=rnorm(8))
dt2 <- data.table(id=1:4, group=rep(letters[3:4], c(4,4)), value=rnorm(8))
setkey(dt1, group)
setkey(dt2, group)
dt1[, {x = value; dt2[, list(cor(x, value)), by = group]}, by=group]
group group V1
1: a c -0.39499814
2: a d 0.74234458
3: b c 0.96088312
4: b d 0.08016723
Obviously, if you ultimately want these in matrix form, you can use dcast or dcast.data.table. Note, however, that the examples above produce two columns with the same name; to fix this, it is worth renaming them inside the j expression. For the original problem:
dcast.data.table(dt[, {x = value; g1=group; dt[, list(g1, g2=group, c =cor(x, value)), by = group]}, by=group], g1~g2, value.var = "c")
g1 a b
1: a 1.0000000 0.1556371
2: b 0.1556371 1.0000000
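If you want an actual matrix object rather than a wide data.table, the cast result can be converted in one more step. The following is only a minimal sketch: the `dt` built here is a hypothetical stand-in for the question's data (assumed to have `id`, `group` and `value` columns, like `dt1` above), the column name `corr` is my own choice, and it assumes a data.table version recent enough that `dcast` dispatches on data.tables. It names the two grouping columns directly in `by`, which avoids the duplicate `group` columns, and then checks the result against base `cor`.

```r
library(data.table)

set.seed(1)
# Hypothetical stand-in for the question's dt: two groups of four values each
dt <- data.table(id = rep(1:4, 2),
                 group = rep(letters[1:2], each = 4),
                 value = rnorm(8))

# Molten pairwise correlations; naming the by-expressions g1 and g2
# gives the two grouping columns distinct names
molten <- dt[, {
  x <- value
  dt[, list(corr = cor(x, value)), by = list(g2 = group)]
}, by = list(g1 = group)]

# Cast to wide form, then drop the label column and convert to a matrix
wide <- dcast(molten, g1 ~ g2, value.var = "corr")
m <- as.matrix(wide[, setdiff(names(wide), "g1"), with = FALSE])
rownames(m) <- wide$g1

# Sanity check against base R on the same data laid out as columns
ref <- cor(cbind(a = dt[group == "a", value],
                 b = dt[group == "b", value]))
stopifnot(isTRUE(all.equal(m, ref, check.attributes = FALSE)))
```

The comparison against `cor` on the column-bound values is there only to confirm that the nested grouping computes the same thing; for a small number of groups, that base R route may well be the simpler one.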