How to: correlation with "blocks" (or "repeated measures"?!)

https://stackoverflow.com/questions/2336056

22-09-2019

Question

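I have the following setup to analyse: we have about 150 subjects, and for each subject we performed a pair of tests (under different conditions) 18 times. The 18 conditions are complementary, in such a way that if we were to average over the tests (for each subject), we would get no correlation between the two tests (between subjects). What we want to know is the correlation (and p-value) between the tests within subjects, but over all the subjects.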

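The way I do this for now is to compute the correlation for each subject, and then look at the distribution of the resulting correlations to see whether its mean differs from 0. But I suspect there might be a better way of answering the same question (someone mentioned "geographical correlation" to me, but a shallow search didn't help).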
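The per-subject approach described above can be sketched roughly as follows. This is only an illustration on simulated data: the variable names, the simulated within-subject correlation of 0.3, and the use of the Fisher z-transform (atanh) before the t-test are assumptions, not something stated in the question.

```r
# Simulated stand-in data: 150 subjects, 18 paired measurements each.
# All names and the true within-subject correlation (0.3) are illustrative.
set.seed(1)
n_subj <- 150
n_cond <- 18
rho <- 0.3

# One Pearson correlation per subject, computed over that subject's 18 pairs
per_subject_r <- sapply(seq_len(n_subj), function(i) {
  test1 <- rnorm(n_cond)
  # test2 shares variance with test1 so that cor(test1, test2) is about rho
  test2 <- rho * test1 + sqrt(1 - rho^2) * rnorm(n_cond)
  cor(test1, test2)
})

# Fisher z-transform: brings the sampling distribution closer to normal
z <- atanh(per_subject_r)

# One-sample t-test: does the mean transformed correlation differ from 0?
tt <- t.test(z, mu = 0)

# Back-transform the mean and its confidence interval to the correlation scale
pooled_r  <- tanh(mean(z))
pooled_ci <- tanh(tt$conf.int)

print(pooled_r)
print(pooled_ci)
print(tt$p.value)
```

Testing the transformed values rather than the raw correlations is a design choice; with only 18 pairs per subject the raw correlations are noticeably skewed when the true correlation is far from 0.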

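P.S.: I understand there might be room here for some sort of mixed model, but I would prefer to present a "correlation", and I am not sure how to extract such an output from a mixed model.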

Also, here is a short dummy code to give an idea of what I am talking about:

attach(longley)
N <- length(Unemployed)
block <- c(
        rep( "a", N),
        rep( "b", N),
        rep( "c", N)
        )

Unemployed.3 <- c(Unemployed + rnorm(1),
                    Unemployed + rnorm(1),
                    Unemployed + rnorm(1))

GNP.deflator.3 <- c(GNP.deflator + rnorm(1),
                    GNP.deflator + rnorm(1),
                    GNP.deflator + rnorm(1))

cor(Unemployed, GNP.deflator)
cor(Unemployed.3, GNP.deflator.3)
cor(Unemployed.3[block == "a"], GNP.deflator.3[block == "a"])
cor(Unemployed.3[block == "b"], GNP.deflator.3[block == "b"])
cor(Unemployed.3[block == "c"], GNP.deflator.3[block == "c"])
# (I would like to somehow combine the last three correlations...)

Any ideas will be welcome.

Best, Tal

Solution

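I agree with Tristan: you are looking for the ICC. The only difference from standard implementations is that the two raters (tests) evaluate each subject repeatedly. There might be an implementation that allows that. In the meantime, here is another approach to getting the correlation.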

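You can use "general linear models", a generalization of linear models that explicitly allows correlation between residuals. The code below implements this using the gls function of the nlme package (gls fits the model by generalized least squares). I am sure there are other ways as well. To use this function we first have to reshape the data into a "long" format. I also changed the variable names to x and y for simplicity, and I used +rnorm(N) instead of the +rnorm(1) in your code, because that is what I think you meant.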
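The data-generation step with +rnorm(N) is not shown in the answer. A minimal sketch of what it might look like is below; the seed is arbitrary, `block_r` is a hypothetical helper name, and `longley$` is used instead of `attach(longley)` purely to keep the snippet self-contained.

```r
# Rebuild the question's example data, but with independent noise per
# observation (rnorm(N)) instead of one constant shift per block (rnorm(1)).
set.seed(42)  # arbitrary seed, only for reproducibility
N <- nrow(longley)
block <- rep(c("a", "b", "c"), each = N)

Unemployed.3 <- c(longley$Unemployed + rnorm(N),
                  longley$Unemployed + rnorm(N),
                  longley$Unemployed + rnorm(N))

GNP.deflator.3 <- c(longley$GNP.deflator + rnorm(N),
                    longley$GNP.deflator + rnorm(N),
                    longley$GNP.deflator + rnorm(N))

# With per-observation noise the three block correlations are no longer
# identical, which is what makes combining them a meaningful problem.
block_r <- sapply(split(seq_along(block), block), function(idx) {
  cor(Unemployed.3[idx], GNP.deflator.3[idx])
})
print(block_r)
```

With +rnorm(1), each block is just the original series shifted by a constant, so every block yields exactly the same correlation as the unmodified data.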

library(reshape)
library(nlme)
dd <- data.frame(x=Unemployed.3, y=GNP.deflator.3, block=factor(block))
dd$occasion <- factor(rep(1:N, 3))  # variable denoting measurement occasions
dd2 <- melt(dd, id=c("block","occasion"))  # reshape

# fit model with the values within a measurement occasion correlated
#   and different variances allowed for the two variables
mod <- gls(value ~ variable + block, data=dd2,
           correlation=corSymm(form=~1|block/occasion),
           weights=varIdent(form=~1|variable))
# extract correlation
mod$modelStruct$corStruct

Within the modelling framework you can use a likelihood-ratio test to get a p-value. nlme can also give you a confidence interval:

# same model, but without the within-occasion correlation structure
mod2 <- gls(value ~ variable + block, data=dd2,
            weights=varIdent(form=~1|variable))
anova(mod, mod2)   # likelihood-ratio test for corr=0

intervals(mod)$corStruct  # confidence interval for the correlation

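Other tips

If I understand your question correctly, you are interested in computing the intraclass correlation between multiple tests. There is an implementation in the psy package, although I have not used it.

If you want to do inference on the correlation estimate, you could bootstrap the subjects. Just make sure to keep each subject's tests together in every resample.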
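A rough sketch of such a subject-level bootstrap, in base R, could look like the following. The data are simulated, the column names (subject, test1, test2) are made up for the illustration, and the choice of statistic (the mean of the per-subject correlations) and of a percentile interval are assumptions rather than something the answer prescribes.

```r
# Simulated stand-in data in long format: one row per subject-by-condition.
# Column names (subject, test1, test2) are illustrative, not from the thread.
set.seed(2)
n_subj <- 150
n_cond <- 18
dat <- data.frame(subject = rep(seq_len(n_subj), each = n_cond),
                  test1   = rnorm(n_subj * n_cond))
dat$test2 <- 0.3 * dat$test1 + rnorm(n_subj * n_cond)

# Split once so that each subject's 18 rows always travel together
by_subject <- split(dat, dat$subject)

# Statistic: mean of the per-subject correlations over a list of subjects
mean_within_r <- function(subjects) {
  mean(sapply(subjects, function(s) cor(s$test1, s$test2)))
}

observed <- mean_within_r(by_subject)

# Resample whole subjects with replacement; rows within a subject are
# never shuffled or separated.
n_boot <- 500
boot_stats <- replicate(n_boot, {
  picked <- sample(length(by_subject), replace = TRUE)
  mean_within_r(by_subject[picked])
})

# Percentile bootstrap interval for the mean within-subject correlation
ci <- quantile(boot_stats, c(0.025, 0.975))
print(observed)
print(ci)
```

Indexing the pre-split list, rather than binding resampled rows back into one data frame, keeps a subject that is drawn twice as two separate units instead of merging them into one 36-row subject.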

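I'm no expert, but this looks to me like what you want. It is automated, short to code, gives the same correlations as your example above, and produces p-values.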

> df = data.frame(block=block, Unemployed=Unemployed.3,
+ GNP.deflator=GNP.deflator.3)
> require(plyr)
Loading required package: plyr
> ddply(df, "block", function(x){
+   as.data.frame(
+     with(x,cor.test(Unemployed, GNP.deflator))[c("p.value","estimate")]
+ )})
  block    p.value  estimate
1     a 0.01030636 0.6206334
2     b 0.01030636 0.6206334
3     c 0.01030636 0.6206334

To see all the details, do this:

> dlply(df, "block", function(x){with(x,cor.test(Unemployed, GNP.deflator))})
$a

    Pearson's product-moment correlation

data:  Unemployed and GNP.deflator 
t = 2.9616, df = 14, p-value = 0.01031
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1804410 0.8536976 
sample estimates:
      cor 
0.6206334 


$b

    Pearson's product-moment correlation

data:  Unemployed and GNP.deflator 
t = 2.9616, df = 14, p-value = 0.01031
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1804410 0.8536976 
sample estimates:
      cor 
0.6206334 


$c

    Pearson's product-moment correlation

data:  Unemployed and GNP.deflator 
t = 2.9616, df = 14, p-value = 0.01031
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1804410 0.8536976 
sample estimates:
      cor 
0.6206334 


attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
  block
1     a
2     b
3     c

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow