I have been trying to run a power analysis for study design that will be analyzed by a 4 factor Repeated Measures ANOVA where all factors are within-subjects. After a lot of searching and some help on Cross Validated, it is clear that I need to
- Do a simulation where I generate data based on some pilot data I have
- Run it through the appropriate anova model
- And then iterate to find the average power.
The code pasted below was found at this link, and it seems to do exactly what I want to do, albeit for only a 2 factor within-subjects ANOVA.
However, I am having trouble understanding all of the details and what specific lines do because I am very new to R. I would be very grateful if anyone could shed some light on these lines of code:
My questions:
Why is the intercept for all conditions set to -1? Is that standard to this model?
intercept = rep(-1, nconds)
Do the values in these columns represent the levels of the x1 and x2 factors or are they SDs etc? If they are the levels, shouldn't the levels for x2
be (0, .5, 0, .5)
?
true.effect.x1 = c(0, 0, .5, .5)
true.effect.x2 = c(0, .5, .5, .5)
What mean and SDs do I take from my pilot data (from multiple subjects) for these three lines of code?
#relatively large subject-specific variance in intercepts
sub.intercept = rep(rnorm(nsub, mean=0, sd=2), times=1, each=nconds)
#relatively small by-subject adjustments to the effects
sub.effect = rep(rnorm(nsub, mean=0, sd=0.05), times=1, each=nconds)
#unexplained error
error = rnorm(nsub*nconds, mean=0, sd=1)
I know this is a very long question but I would really appreciate any help anyone could provide! Thank you so so much!
FULL CODE
library(ez)
nsub = 30
nconds = 4
nsims = 100
#create an empty matrix that will be populated with p-values
p = matrix(NA, nrow=nsims, ncol=3)
#subject vector
sub = sort(rep(1:nsub, nconds))
#2x2 factorial design
cond = data.frame(x1=c('a','a','b','b'), x2=c('c','d','c','d'))
# fixed effects
intercept = rep(-1, nconds)
true.effect.x1 = c(0, 0, .5, .5)
true.effect.x2 = c(0, 0.5, .5, .5)
X = rep((intercept + true.effect.x1 + true.effect.x2),nsub)
#simulation loop
for (x in 1:nsims)
{
#random effects
#relatively large subject-specific variance in intercepts
sub.intercept = rep(rnorm(nsub, mean=0, sd=2), times=1, each=nconds)
#relatively small by-subject adjustments to the effects
sub.effect = rep(rnorm(nsub, mean=0, sd=0.05), times=1, each=nconds)
#unexplained error
error = rnorm(nsub*nconds, mean=0, sd=1)
#simulated dependent variable
observed.y = X + (sub.intercept + sub.effect + error)
#place everything in a data frame
df = cbind(sub, cond, observed.y)
names(df) = c('sub','x1','x2','y')
df$sub = as.factor(df$sub)
#extract the p-values for each effect from a repeated measure ANOVA (ezANOVA from 'ez' package)
p[x,1] = ezANOVA(data=df, dv=.(y), wid=.(sub), within=.(x1, x2))$ANOVA[1,5]
p[x,2] = ezANOVA(data=df, dv=.(y), wid=.(sub), within=.(x1, x2))$ANOVA[2,5]
p[x,3] = ezANOVA(data=df, dv=.(y), wid=.(sub), within=.(x1, x2))$ANOVA[3,5]
}
###### p-values < .05 ? ######
sig.x1 = ifelse(p[,1] <= .05, 1, 0)
sig.x2 = ifelse(p[,2] <= .05, 1, 0)
sig.int = ifelse(p[,3] <= .05, 1, 0)
###### Histograms ######
par(mfrow=c(3,1))
hist(p[,1], 20, xaxp=c(0, 1, 20), col=2, main = paste('power X1:', mean(sig.x1 * 100), '% with ', nsub, 'subjects'))
hist(p[,2], 20, xaxp=c(0, 1, 20), col=2, main = paste('power X2:', mean(sig.x2 * 100), '% with ', nsub, 'subjects'))
hist(p[,3], 20, xaxp=c(0, 1, 20), col=2, main = paste('power interaction:', mean(sig.int * 100), '% with ', nsub, 'subjects'))