How do I designate a negative binomial error distribution in a GLM using R?

StackOverflow https://stackoverflow.com/questions/18685311

  •  28-06-2022

Question

I'm constructing a model using the glm() function in R. Let's say that I know that my data have an error distribution that fits a negative binomial distribution.

When I search the R manual for the various families, family=binomial is offered as an option, but negative binomial is not.

In the same section of the R manual (family), NegBinomial is linked in the "See also" section, but it is presented in the context of binomial coefficients (and I'm not even sure what this is referring to).

So, to summarize, I'm hoping to find syntax that would be analogous to glm(y~x, family=negbinomial, data=d,na.omit).


Solution

With an unknown overdispersion parameter, the negative binomial is not part of the exponential family, so it can't be fitted as a standard GLM (or by glm()). The glm.nb() function in the MASS package handles this case by estimating the parameter along with the regression coefficients:

library(MASS)
glm.nb(y~x, ...)
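
To make the call above concrete, here is a minimal sketch on simulated data. The data frame d, the coefficients 0.5 and 1.0, and the true theta of 2 are all made up for illustration; only glm.nb() and the theta and SE.theta components of its result come from MASS.

```r
library(MASS)  # provides glm.nb()

set.seed(1)
# Hypothetical data: log-linear mean in x, negative binomial noise.
# In rnbinom(), `size` plays the role of MASS's theta.
d <- data.frame(x = runif(500))
d$y <- rnbinom(500, size = 2, mu = exp(0.5 + 1.0 * d$x))

# theta is estimated jointly with the regression coefficients
fit <- glm.nb(y ~ x, data = d)

coef(fit)      # intercept and slope on the log scale
fit$theta      # estimated theta
fit$SE.theta   # approximate standard error of theta
```

With a default log link, the coefficients are interpreted on the log scale of the mean, as in a Poisson GLM.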

If you happen to have a known/fixed overdispersion parameter (e.g. if you want to fit a geometric distribution model, which has theta=1), you can use the negative.binomial family from MASS:

glm(y~x,family=negative.binomial(theta=1), ...)
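
A runnable version of the fixed-theta case might look like the sketch below. The simulated data frame d and its coefficients are assumptions for illustration. Note that summary() on a plain glm() fit with this family estimates a separate scale (dispersion) parameter by default; passing dispersion = 1 to summary() is one way to hold it fixed.

```r
library(MASS)  # provides the negative.binomial() family

set.seed(2)
# Hypothetical data: geometric counts, i.e. negative binomial with size = 1
d <- data.frame(x = runif(500))
d$y <- rnbinom(500, size = 1, mu = exp(0.5 + 1.0 * d$x))

# theta is fixed at 1, so this is an ordinary GLM fit
fit_geom <- glm(y ~ x, family = negative.binomial(theta = 1), data = d)

coef(fit_geom)
summary(fit_geom, dispersion = 1)  # keep the GLM scale parameter at 1
```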

It might not hurt if MASS::glm.nb were in the "See Also" section of ?glm ...

Other answers

I don't believe theta is the overdispersion parameter. Theta is a shape parameter for the distribution and overdispersion is the same as k, as discussed in The R Book (Crawley 2007). The model output from a glm.nb() model implies that theta does not equal the overdispersion parameter:

Dispersion parameter for Negative Binomial(0.493) family taken to be 0.4623841

The dispersion parameter reported here (0.4623841) is a different value from theta (0.493).
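
The two numbers in that output line can be inspected separately. The sketch below uses simulated data (the data frame d and all parameter values are hypothetical) and relies on two things: MASS parameterises the variance as mu + mu^2/theta, and summary() stores the GLM scale parameter in its dispersion component. It is meant only to show where each quantity lives, not to settle which one should be called "overdispersion".

```r
library(MASS)

set.seed(3)
# Hypothetical overdispersed counts with true theta = 0.5
d <- data.frame(x = runif(500))
d$y <- rnbinom(500, size = 0.5, mu = exp(1 + d$x))

# glm.nb(): theta is estimated, and summary() fixes the scale at 1
fit_nb <- glm.nb(y ~ x, data = d)
fit_nb$theta                    # shape parameter theta
summary(fit_nb)$dispersion      # GLM scale parameter

# glm() with theta fixed: summary() estimates the scale from Pearson residuals
fit_fixed <- glm(y ~ x, family = negative.binomial(theta = fit_nb$theta), data = d)
summary(fit_fixed)$dispersion   # generally not equal to theta
```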

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow