Pregunta

I have a problem with fitdistr{MASS} function in R. I have this vector:

a <- c(26,73,84,115,123,132,159,207,240,241,254,268,272,282,300,302,329,346,359,367,375,378, 384,452,475,495,503,531,543,563,594,609,671,687,691,716,757,821,829,885,893,968,1053,1081,1083,1150,1205,1262,1270,1351,1385,1498,1546,1565,1635,1671,1706,1820,1829,1855,1873,1914,2030,2066,2240,2413,2421,2521,2586,2727,2797,2850,2989,3110,3166,3383,3443,3512,3515,3531,4068,4527,5006,5065,5481,6046,7003,7245,7477,8738,9197,16370,17605,25318,58524)

and I want to fit a gamma distribution to the data with a command:

fitted.gamma <- fitdistr(a, "gamma")

but I have such error:

Error in optim(x = c(26, 73, 84, 115, 123, 132, 159, 207, 240, 241, 254,  : 
non-finite finite-difference value [1]
In addition: Warning messages:
1: In densfun(x, parm[1], parm[2], ...) : NaNs produced
2: In densfun(x, parm[1], parm[2], ...) : NaNs produced
3: In densfun(x, parm[1], parm[2], ...) : NaNs produced
4: In densfun(x, parm[1], parm[2], ...) : NaNs produced

So I tried with initializing the parameters:

(fitted.gamma <- fitdistr(a, "gamma", start=list(1,1)))

The object fitted.gamma is created but when printed, creates an error:

Error in dn[[2L]] : subscript out of bounds

Do you know what is happening or maybe know some other R functions to fit univariate distributions by MLE?

Thanks in advance for any help or response.

Kuba

¿Fue útil?

Solución

Always plot your stuff first, you scaling is far offfffffff.

library(MASS)
a <- c(26,73,84,115,123,132,159,207,240,241,254,268,272,282,300,302,329,346,359,367,375,378, 384,452,475,495,503,531,543,563,594,609,671,687,691,716,757,821,829,885,893,968,1053,1081,1083,1150,1205,1262,1270,1351,1385,1498,1546,1565,1635,1671,1706,1820,1829,1855,1873,1914,2030,2066,2240,2413,2421,2521,2586,2727,2797,2850,2989,3110,3166,3383,3443,3512,3515,3531,4068,4527,5006,5065,5481,6046,7003,7245,7477,8738,9197,16370,17605,25318,58524)
## Ooops, rater wide
plot(hist(a))
fitdistr(a/10000,"gamma") # gives warnings
# No warnings
fitted.gamma <- fitdistr(a/10000, dgamma,  start=list(shape = 1, rate = 0.1),lower=0.001)

Now you can decide what to do with the scaling

Otros consejos

For data that clearly fits the gamma distribution, but is on the wrong scale (i.e., as if it had been multiplied/divided by a large number), here's an alternative approach to fitting the gamma distribution:

fitgamma <- function(x) {
  # Equivalent to `MASS::fitdistr(x, densfun = "gamma")`, where x are first rescaled to 
  # the appropriate scale for a gamma distribution. Useful for fitting the gamma distribution to 
  # data which, when multiplied by a constant, follows this distribution
  if (!requireNamespace("MASS")) stop("Requires MASS package.")

  fit <- glm(formula = x ~ 1, family = Gamma)
  out <- MASS::fitdistr(x * coef(fit), "gamma")
  out$scaling_multiplier <- unname(coef(fit))
  out
}

Usage:

set.seed(40)
test <- rgamma(n = 100, shape = 2, rate = 2)*50000
fitdistr(test, "gamma") # fails
dens_fit <- fitgamma(test) # successs
curve(dgamma(x, 2, 2), to = 5) # true distribution
curve(dgamma(x, dens_fit$estimate['shape'], dens_fit$estimate['rate']), add=TRUE, col=2) # best guess
lines(density(test * dens_fit$scaling_multiplier), col = 3)

plot of density

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top