Question

If I want to use the boot() function from R's boot package to assess the significance of the Pearson correlation coefficient between two vectors, should I do it like this:

boot(re1, cor, R = 1000)

where re1 is a two column matrix for these two observation vectors? I can't seem to get this right because cor of these vectors is 0.8, but the above function returns -0.2 as t0.


Solution

Just to emphasize the general idea of bootstrapping in R, although @caracal already answered your question in a comment: when using boot, you need a data structure (usually a matrix or data frame) that can be resampled by row. The statistic is computed in a function that receives this data together with a vector of row indices, and returns the statistic of interest computed on the resampled rows. You then call boot(), which applies this function over R bootstrap replicates and collects the results in a structured object. Those results can in turn be assessed using boot.ci().
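
As a minimal sketch of why the call in the question goes wrong: boot() evaluates t0 by calling the statistic with the data and the original indices 1:n, so passing cor directly amounts to cor(re1, 1:n), i.e. the correlation of each column with the row index rather than the correlation between the two columns. The data below are simulated and the names cor_stat and wrong_t0 are made up for illustration; they are not from the question.

```r
library(boot)

set.seed(1)                          # assumption: seed only for reproducibility
n   <- 50
x   <- rnorm(n)
y   <- 0.8 * x + rnorm(n, sd = 0.6)
re1 <- cbind(x, y)                   # hypothetical stand-in for the asker's matrix

# With statistic = cor, the second argument (the indices) is taken as cor()'s y,
# so the "statistic" is the correlation of each column with 1:n.
wrong_t0 <- cor(re1, seq_len(n))

# A statistic with the (data, indices) signature that boot() expects:
# resample rows by index, then correlate the two columns.
cor_stat <- function(data, idx) cor(data[idx, 1], data[idx, 2])

res <- boot(re1, cor_stat, R = 1000)
res$t0                               # same value as cor(x, y)
```

With this wrapper, t0 matches the plain cor() of the two vectors, and res$t holds the R bootstrap replicates.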

Here are two working examples using the low birth weight study (the birthwt data set) from the MASS package.

library(boot)  # boot() and boot.ci()
library(MASS)  # birthwt data set
data(birthwt)
# compute CIs for correlation between mother's weight and birth weight
cor.boot <- function(data, k) cor(data[k,])[1,2]
cor.res <- boot(data=with(birthwt, cbind(lwt, bwt)), 
                statistic=cor.boot, R=500)
cor.res
boot.ci(cor.res, type="bca")
# compute CI for a particular regression coefficient, e.g. bwt ~ smoke + ht
fm <- bwt ~ smoke + ht
reg.boot <- function(formula, data, k) coef(lm(formula, data[k,]))
reg.res <- boot(data=birthwt, statistic=reg.boot, 
                R=500, formula=fm)
boot.ci(reg.res, type="bca", index=2) # smoke
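
To relate this back to the question about significance, here is a rough sketch of how the returned object might be inspected. It reuses the lwt/bwt correlation from above with a percentile interval; the name cor_stat, the seed, and the choice of R = 2000 are assumptions for illustration. Checking whether the interval excludes zero is only an informal indication, not a formal test.

```r
library(boot)
library(MASS)                        # provides the birthwt data set

# hypothetical helper: correlation of the two columns on the resampled rows
cor_stat <- function(data, idx) cor(data[idx, 1], data[idx, 2])
dat <- with(birthwt, cbind(lwt, bwt))

set.seed(42)                         # assumption: seed only for reproducibility
res <- boot(dat, cor_stat, R = 2000)

res$t0                               # statistic on the original sample
sd(res$t)                            # bootstrap estimate of its standard error

ci <- boot.ci(res, type = "perc")
limits <- ci$percent[4:5]            # lower and upper 95% percentile limits

# informal check: does the interval exclude zero?
excludes_zero <- limits[1] > 0 || limits[2] < 0
```
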
Licensed under: CC-BY-SA with attribution