Question

This seems like it should be straightforward, but I have a data frame and need to extract the correlation of the scores for each possible pair of ids across trials (in other words, compare the score of id 1 on trial 10 to id 2 on trial 10, id 1 on trial 10 to id 3 on trial 10, and so on). An example data frame is as follows.

id <- c('1','1','1','2', '2', '2', '3', '3', '3')
trial <- c('10','11','12','10', '11', '12', '10', '11', '12')
score <- c('634', '981', '101', '621', '31', '124', '827', '404', '92')
d <- data.frame(id, trial, score)

d

 id trial score
  1    10   634
  1    11   981
  1    12   101
  2    10   621
  2    11    31
  2    12   124
  3    10   827
  3    11   404
  3    12    92

The result should be a new matrix with correlations of all possible combinations. Ostensibly it's for evaluating score reliability across ids.

The data is about 10000 lines long, which causes R to choke. I've looked through the forums here and tried to figure it out using comb or outer, but got confused by the syntax. Any help would be much appreciated!


Solution

Based on @Roland's idea, but using the base R function xtabs:

> d$score <- as.numeric(as.character(d$score))
> cor(xtabs(score ~ trial + id, data=d))
            1           2         3
1  1.00000000 -0.02568439 0.5295394
2 -0.02568439  1.00000000 0.8344046
3  0.52953942  0.83440458 1.0000000
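One caveat worth checking on real data: xtabs sums the scores in each trial/id cell, so a combination that is absent from the data becomes 0 rather than missing. The sketch below is not part of the original answer. It assumes, hypothetically, that id 2 has no score for trial 12, and uses tapply so the gap stays NA and cor can skip it pairwise.

```r
# Hypothetical variant of the example data: id 2 has no trial 12.
d <- data.frame(
  id    = c("1", "1", "1", "2", "2", "3", "3", "3"),
  trial = c("10", "11", "12", "10", "11", "10", "11", "12"),
  score = c(634, 981, 101, 621, 31, 827, 404, 92),
  stringsAsFactors = FALSE
)

# tapply leaves absent trial/id cells as NA (xtabs would put 0 there).
# mean() also averages any duplicated trial/id rows.
wide <- tapply(d$score, list(d$trial, d$id), mean)

# Each pair of ids is correlated over the trials both of them have.
cor_mat <- cor(wide, use = "pairwise.complete.obs")
cor_mat
```

With complete data, as in the question's example, both approaches should give the same matrix.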

Other tips

One way to achieve this is with data.table. You can set it up as follows:

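library(data.table)
d.t <- data.table(d)
# scores were entered as strings; convert them to numbers before correlating
d.t[, score := as.numeric(as.character(score))]
# key by id first so each id's scores can be looked up, sorted by trial
setkey(d.t, id, trial)

And then something like this gives the correlation for one pair of ids across trials (looking up a single trial/id combination would return one score per id, and cor of two single values is NA):

temp <- cor(d.t[J("1")]$score, d.t[J("2")]$score)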

After this, you could put a loop around it, or use sapply, and then rbind the results into a matrix or data frame.
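The pairwise loop described above can be sketched in base R as follows. This is only an illustration, not the answerer's code: it swaps data.table for split and combn, the names scores_by_id, pairs, pair_cor and cor_mat are made up for the example, and it assumes every id has a score for every trial.

```r
# Example data from the question, with scores stored as numbers.
d <- data.frame(
  id    = rep(c("1", "2", "3"), each = 3),
  trial = rep(c("10", "11", "12"), times = 3),
  score = c(634, 981, 101, 621, 31, 124, 827, 404, 92),
  stringsAsFactors = FALSE
)

# Sort by id and trial so every id's scores line up in the same trial order.
d <- d[order(d$id, d$trial), ]
scores_by_id <- split(d$score, d$id)

# All unordered pairs of ids, one pair per column.
pairs <- combn(names(scores_by_id), 2)

# Correlate each pair of ids across trials.
pair_cor <- apply(pairs, 2, function(p) {
  cor(scores_by_id[[p[1]]], scores_by_id[[p[2]]])
})

# Fill a symmetric matrix, with 1 on the diagonal.
ids <- names(scores_by_id)
cor_mat <- diag(1, length(ids))
dimnames(cor_mat) <- list(ids, ids)
cor_mat[t(pairs)] <- pair_cor                        # upper triangle
cor_mat[t(pairs)[, 2:1, drop = FALSE]] <- pair_cor   # mirrored lower triangle
cor_mat
```

For many ids, calling cor once on a wide trial-by-id table is likely to be much faster than looping over pairs, so this version is mainly useful when each pair needs custom handling.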

HTH

If you don't have too many ids, I would reshape the data to wide format and take advantage of the fact that cor accepts a data.frame as input:

d$score <- as.numeric(as.character(d$score))
library(reshape2)
d1 <- dcast(d, trial ~ id, value.var = "score")
cor(d1[,-1])
#            1           2         3
#1  1.00000000 -0.02568439 0.5295394
#2 -0.02568439  1.00000000 0.8344046
#3  0.52953942  0.83440458 1.0000000
Licensed under: CC-BY-SA with attribution