Question

I have used the following R packages: mice, mitools, and pROC.

Basic design: 3 predictor measures with missing data rates between 5% and 70% on n~1,000. 1 binary target outcome variable.

Analytic Goal: Determine the AUROC of each of the 3 predictors.

I used the mice package to impute the data and now have m imputed datasets. Using the following commands, I can fit an ROC curve (and its AUROC) on each of the m datasets:

fit1 <- with(imp2, roc(target, symptom1, ci = TRUE))
fit2 <- with(imp2, roc(target, symptom2, ci = TRUE))
fit3 <- with(imp2, roc(target, symptom3, ci = TRUE))

I can see the estimates for each of m datasets without any problems.

fit1
fit2
fit3

To combine the parameters, I attempted to use pool() (which comes from mice, not mitools):

summary(pool(fit1))
summary(pool(fit2))
summary(pool(fit3))

I get the following error message: "Error in pool(fit): Object has no vcov() method".

When combining coefficient estimates from m datasets, my understanding is that this is a simple average of the coefficients. However, the error term is more complex.

My question: How do I pool the "m" ROC parameter estimates (AUROC and 95% C.I. or S.E.) to get an accurate estimate of the error term for significance testing/95% Confidence Intervals?

Thank you for any help in advance.

No correct solution

OTHER TIPS

I think the following works to combine the estimates.

pROC produces a point estimate for the AUROC as well as a 95% Confidence Interval.

To combine the AUROC from the m imputed datasets, simply average the m AUROC estimates.

To get an appropriate standard error and then a 95% C.I., I converted each 95% C.I. into an S.E. (S.E. = (upper - lower) / (2 x 1.96), which assumes a symmetric normal-approximation interval). Using the standard formulas (see the Multiple Imputation FAQ), I computed the within-imputation, between-imputation, and total variance of the estimate. Once I had the pooled standard error, I converted it back to a 95% C.I.
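The steps above can be sketched in base R as follows. This is only a minimal illustration of Rubin's rules, not a tested recipe for pROC output: the function names `ci_to_se` and `pool_auc` are made up for this example, the input numbers are hypothetical, and the C.I.-to-S.E. conversion assumes symmetric normal-approximation intervals.

```r
# Convert a symmetric normal-approximation C.I. back into a standard error.
# (Hypothetical helper; assumes the interval is estimate +/- z * SE.)
ci_to_se <- function(lower, upper, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  (upper - lower) / (2 * z)
}

# Pool m AUC estimates with Rubin's rules.
# auc: one AUC per imputed dataset; se: the matching standard errors.
pool_auc <- function(auc, se, conf.level = 0.95) {
  m <- length(auc)
  stopifnot(m > 1, length(se) == m)
  q_bar   <- mean(auc)                       # pooled point estimate
  within  <- mean(se^2)                      # within-imputation variance
  between <- var(auc)                        # between-imputation variance
  total   <- within + (1 + 1 / m) * between  # total variance
  # Rubin's degrees of freedom for the t reference distribution
  r    <- (1 + 1 / m) * between / within
  df   <- (m - 1) * (1 + 1 / r)^2
  half <- qt(1 - (1 - conf.level) / 2, df) * sqrt(total)
  c(estimate = q_bar, se = sqrt(total), df = df,
    lower = q_bar - half, upper = q_bar + half)
}

# Hypothetical AUCs and 95% C.I. bounds from m = 5 imputed datasets
auc_m   <- c(0.71, 0.74, 0.69, 0.73, 0.72)
lower_m <- c(0.67, 0.70, 0.65, 0.69, 0.68)
upper_m <- c(0.75, 0.78, 0.73, 0.77, 0.76)

pooled <- pool_auc(auc_m, ci_to_se(lower_m, upper_m))
print(round(pooled, 3))
```

With the objects from the question, the per-dataset inputs could presumably be pulled from `fit1$analyses` (the list of fits that `with()` returns for a mids object). Because the AUC is bounded between 0 and 1, pooling on a transformed scale may behave better when the estimates are close to 1; that refinement is not shown here.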

If anyone has any better suggestions, I would very much appreciate it.

I would use bootstrapping with the boot package to assess the different sources of variance. For instance, for the variance due to imputation, you could use something like this:

library(boot)
library(pROC)

# Statistic for boot(): resample the rows, impute the resample, return its AUC
bootstrap.imputation <- function(d, i, symptom) {
    sampled.data <- d[i, ]
    imputed.data <- ... # here the code you use to generate one imputed dataset, but apply it to sampled.data

    auc(roc(imputed.data$target, imputed.data[[symptom]]))
}

boot.n <- 2000
# symptom must be passed by name so that it reaches bootstrap.imputation through ...
# (the fourth positional argument of boot() is sim, not ...)
boot(dataset, bootstrap.imputation, boot.n, symptom = "symptom1")
boot(dataset, bootstrap.imputation, boot.n, symptom = "symptom2")
boot(dataset, bootstrap.imputation, boot.n, symptom = "symptom3")

You can then do the same to assess the variance of the AUC itself: impute your data and apply the bootstrap again (or use the built-in functions of pROC, e.g. ci.auc() with method = "bootstrap").
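For anyone who wants to try the resample-then-impute idea end to end, here is a toy version that runs without mice or pROC. It is a sketch under loud assumptions: `auc_rank` is a base-R Mann-Whitney stand-in for pROC's AUC, `impute_once` is a deliberately naive hot-deck draw standing in for the real imputation model, and the data are simulated. All three names are hypothetical.

```r
library(boot)  # recommended package that ships with R

# AUC via the Mann-Whitney rank statistic (stand-in for pROC; unlike pROC
# it assumes higher scores go with target == 1 and does not auto-flip)
auc_rank <- function(target, score) {
  r  <- rank(score)
  n1 <- sum(target == 1)
  n0 <- sum(target == 0)
  (sum(r[target == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Naive stand-in for the imputation model: replace each missing value with
# a random draw from the observed values of the same column
impute_once <- function(d, col) {
  miss <- is.na(d[[col]])
  d[[col]][miss] <- sample(d[[col]][!miss], sum(miss), replace = TRUE)
  d
}

# Statistic for boot(): resample rows, impute the resample, compute the AUC
boot_stat <- function(d, i, symptom) {
  sampled <- d[i, ]
  imputed <- impute_once(sampled, symptom)
  auc_rank(imputed$target, imputed[[symptom]])
}

# Simulated data: one predictor with 20% of values missing at random
set.seed(1)
n <- 300
target <- rbinom(n, 1, 0.4)
toy <- data.frame(target = target, symptom1 = rnorm(n, mean = target))
toy$symptom1[sample(n, 60)] <- NA

res <- boot(toy, boot_stat, R = 200, symptom = "symptom1")
boot_se <- sd(res$t[, 1])  # bootstrap standard error of the AUC
print(boot_se)
```

Because the imputation is redone inside every resample, the spread of `res$t` reflects both sampling and imputation variability. Swapping in the real imputation model only changes `impute_once`.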

Licensed under: CC-BY-SA with attribution