Question

I have a GLMM object fit using the glmer function in R and want to perform k-fold cross-validation. For simple GLMs I have used the CVbinary function from the DAAG package, as shown below.

> SimpleGLM <- glm(Res ~ Var1 + Var2, data = Data, family=binomial)
> CVbinary(SimpleGLM,  nfolds=10, print.details=TRUE)

Fold:  3 2 4 1 7 10 6 9 5 8
Internal estimate of accuracy = 0.828
Cross-validation estimate of accuracy = 0.827

However, when a random term for IndID is added to the model, CVbinary fails with the error below, apparently because a model fit with glmer is an S4 object.

> GLMMod <- glmer(Res ~ Var1 + Var2 + (1|IndID), data = Data, family=binomial)
> CVbinary(GLMMod, nfolds=10, print.details=TRUE)

Error in obj$data : $ operator not defined for this S4 class

I have been looking online and have been unable to find a function similar to CVbinary that works with S4 objects, but wanted to double check here before I code it manually.

In short, (assuming I am correctly interpreting the R error) is there a function that performs k-fold cross validation on S4 objects?


Solution

You would be well advised to examine the statistical assumptions underlying the question. When the experts approach this problem to assess p-values for individual factors, they emphasize the need for bootstrapping with proper attention to the study design implied by the random-effects specification. See the "draft" GLMM FAQ. (Credit to @BenBolker for authorship and maintenance of that resource. It has expanded greatly in the last year and now even has some cool graphics. It's on its way to becoming a book chapter.) The author of DAAG has also published DAAGxtras, which has a compareModels function that you could set up after using the newly introduced predict methods in the lme4 package.

There's also the archive of the r-sig-mixed-models mailing list, which you can search for cross-validation threads: http://markmail.org/search/?q=+list%3Aorg.r-project.r-sig-mixed-models+cross-validation
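
If you do end up coding it by hand, the predict method for glmer fits mentioned above makes a basic k-fold loop fairly short. The following is only a rough sketch, not a drop-in replacement for CVbinary: cv_glmer_accuracy is a hypothetical helper name, the response is assumed to be coded 0/1, the 0.5 classification cutoff is an arbitrary choice, and the data are simulated purely so the example runs. The folds here are random rows; given the study-design caveat above, you may prefer to hold out whole IndID levels instead.

```r
library(lme4)

# Hypothetical helper (not part of lme4 or DAAG): k-fold cross-validated
# classification accuracy for a binomial GLMM with a 0/1 response.
cv_glmer_accuracy <- function(formula, data, k = 10, seed = 1) {
  set.seed(seed)
  response <- all.vars(formula)[1]   # name of the left-hand-side variable
  # Randomly assign each row to one of k roughly equal folds.
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  correct <- logical(nrow(data))
  for (i in seq_len(k)) {
    train <- data[fold != i, , drop = FALSE]
    test  <- data[fold == i, , drop = FALSE]
    fit <- glmer(formula, data = train, family = binomial)
    # allow.new.levels = TRUE lets predict() cope with grouping levels that
    # are absent from the training fold (they get population-level predictions).
    p <- predict(fit, newdata = test, type = "response",
                 allow.new.levels = TRUE)
    # Classify at an (arbitrary) 0.5 cutoff and compare with the observed 0/1.
    correct[fold == i] <- (p > 0.5) == (test[[response]] == 1)
  }
  mean(correct)
}

# Simulated data for illustration only; variable names follow the question.
set.seed(42)
n_id <- 30
n_per <- 20
Data <- data.frame(
  IndID = factor(rep(seq_len(n_id), each = n_per)),
  Var1  = rnorm(n_id * n_per),
  Var2  = rnorm(n_id * n_per)
)
b <- rnorm(n_id)                     # random intercept per individual
eta <- -0.5 + 1.2 * Data$Var1 - 0.8 * Data$Var2 + b[as.integer(Data$IndID)]
Data$Res <- rbinom(nrow(Data), 1, plogis(eta))

acc <- cv_glmer_accuracy(Res ~ Var1 + Var2 + (1 | IndID), Data, k = 5)
print(acc)
```

Refitting inside the loop sidesteps the obj$data problem entirely, since nothing is extracted from the fitted S4 object except predictions.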

Licensed under: CC-BY-SA with attribution