Explanation for error with lmer but not glmer: Error in checkNlevels(reTrms$flist, n = n, control) :

StackOverflow https://stackoverflow.com/questions/22662870

  •  21-06-2023

Question

I have a model like:

lmer(y ~ x + z + (1|g) + (1|dummy) , data = dat)

where dummy is an individual-level random effect (one level per observation) intended to account for overdispersion, i.e. factor(1:nrow(dat)).

When I run this, I get the following error, which I do not understand. Does it mean I have overfitted my model?

Error in checkNlevels(reTrms$flist, n = n, control) : number of levels of each grouping factor must be < number of observations

However, when I run the same model with a Poisson family, I do not get this error:

glmer(y ~ x + z + (1|g) + (1|dummy) , data = dat, family = poisson)

I know the individual-level random effect may not even make sense in a Gaussian mixed model, but I want to know whether the Poisson example is hiding something from me, i.e. whether it too is overfitted.

Solution

Overdispersion isn't really a useful concept in the normal model because it is already fitting a variance for the variability at the observation level. So the error message is telling you that you can't have a grouping factor at the observation level. In that sense, yes, you are trying to overfit your model.

In a Poisson model (or another GLM without a free scale parameter, such as the binomial), however, it does make sense: the variability at the observation level is fixed by the variance function of the family, so adding an extra variance term to the model is a reasonable way to account for any additional variability at the observation level. Hence glmer does not do the same check that lmer does.
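
The argument above can be written out in symbols. This is only a sketch: the notation (beta, u, b, epsilon and the sigma terms) is introduced here for illustration and does not appear in the original answers.

```latex
% Gaussian model as fitted by lmer, with an observation-level effect b_i:
y_i = \mathbf{x}_i^\top \boldsymbol{\beta} + u_{g[i]} + b_i + \varepsilon_i,
\qquad u_j \sim N(0, \sigma_g^2), \quad
       b_i \sim N(0, \sigma_d^2), \quad
       \varepsilon_i \sim N(0, \sigma^2)

% b_i and \varepsilon_i are independent and each occurs exactly once per
% observation, so the likelihood depends on them only through their sum:
b_i + \varepsilon_i \sim N(0,\; \sigma_d^2 + \sigma^2)
% Any pair (\sigma_d^2, \sigma^2) with the same total fits equally well.

% Poisson model as fitted by glmer: the conditional variance is tied to the mean,
y_i \mid \eta_i \sim \mathrm{Poisson}(e^{\eta_i}), \qquad
\eta_i = \mathbf{x}_i^\top \boldsymbol{\beta} + u_{g[i]} + b_i, \qquad
\mathrm{Var}(y_i \mid \eta_i) = \mathrm{E}(y_i \mid \eta_i) = e^{\eta_i}

% so b_i adds variance that nothing else in the model can supply.
% Averaging over b_i (holding u_{g[i]} fixed):
\mathrm{Var}(y_i) = \mu_i + \mu_i^2 \left( e^{\sigma_d^2} - 1 \right),
\qquad \mu_i = \exp\!\left( \mathbf{x}_i^\top \boldsymbol{\beta} + u_{g[i]} + \tfrac{1}{2}\sigma_d^2 \right)
```

In the Gaussian case the new variance is redundant with the residual variance. In the Poisson case it is the only free parameter controlling variation beyond the mean.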

OTHER TIPS

This is really a bit more of a CrossValidated question, but:

  • the definition of a linear mixed model includes a residual variance term by default; in other words, the model already estimates the amount of dispersion in the residuals. If you include an observation-level random effect in the formula, its variance will be confounded with the residual variance (the two terms will be jointly unidentifiable; there will be a set of equally good fits to the model where the two variances sum to a constant). This problem won't necessarily affect the rest of the model badly, but inclusion of such an unidentifiable term usually means that you're making a mistake/don't quite understand what you're doing, so the default behaviour is to return an error in this case. If you do have a good reason to fit this model, you can use lmerControl to override the error.
  • the definition of the most common generalized linear mixed models (i.e. Poisson and binomial) does not include a residual variance term -- in particular, the "scale parameter" of the Poisson and binomial models is fixed to 1 by definition. Thus, it is possible to have overdispersion in the data. One of the standard ways to deal with this is to add an observation-level random effect as you have done above. Since there is no default residual variance term in the model, there's nothing suspicious about this, and glmer doesn't complain.
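
A minimal sketch of both points, assuming the lme4 package is installed. The data are simulated purely for illustration; the variable names follow the question, but the sample sizes and effect sizes are arbitrary assumptions. The three check.* arguments are the lmerControl settings that turn off the observation-versus-levels checks.

```r
library(lme4)

# Simulated, overdispersed count data (illustrative assumption, not the asker's data)
set.seed(1)
n_groups <- 20
n_per_group <- 5
n <- n_groups * n_per_group
dat <- data.frame(
  g = factor(rep(seq_len(n_groups), each = n_per_group)),
  x = rnorm(n),
  z = rnorm(n)
)
dat$dummy <- factor(seq_len(nrow(dat)))  # one level per observation
group_eff <- rnorm(n_groups, sd = 0.4)[dat$g]
obs_eff <- rnorm(n, sd = 0.5)            # source of the overdispersion
dat$y <- rpois(n, lambda = exp(0.5 + 0.3 * dat$x - 0.2 * dat$z + group_eff + obs_eff))

# 1. Default lmer refuses the observation-level grouping factor
default_fit <- try(
  lmer(y ~ x + z + (1 | g) + (1 | dummy), data = dat),
  silent = TRUE
)
inherits(default_fit, "try-error")

# 2. Overriding the checks lets the model fit, but the 'dummy' and residual
#    variances are confounded, so their individual values are not meaningful
ctrl <- lmerControl(
  check.nobs.vs.nlev  = "ignore",
  check.nobs.vs.rankZ = "ignore",
  check.nobs.vs.nRE   = "ignore"
)
fit_override <- lmer(y ~ x + z + (1 | g) + (1 | dummy), data = dat, control = ctrl)
VarCorr(fit_override)

# 3. The Poisson GLMM accepts the same term without any override
fit_pois <- glmer(y ~ x + z + (1 | g) + (1 | dummy), data = dat, family = poisson)
VarCorr(fit_pois)
```

The overridden lmer fit may report a singular fit or convergence warnings. That is a symptom of the unidentifiability described above, not a sign that the override failed.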

See e.g. http://glmm.wikidot.com/faq#overdispersion_est for more details and references.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow