I have a problem with fitdistr{MASS} function in R. I have this vector:
a <- c(26,73,84,115,123,132,159,207,240,241,254,268,272,282,300,302,329,346,359,367,375,378, 384,452,475,495,503,531,543,563,594,609,671,687,691,716,757,821,829,885,893,968,1053,1081,1083,1150,1205,1262,1270,1351,1385,1498,1546,1565,1635,1671,1706,1820,1829,1855,1873,1914,2030,2066,2240,2413,2421,2521,2586,2727,2797,2850,2989,3110,3166,3383,3443,3512,3515,3531,4068,4527,5006,5065,5481,6046,7003,7245,7477,8738,9197,16370,17605,25318,58524)
and I want to fit a gamma distribution to the data with a command:
fitted.gamma <- fitdistr(a, "gamma")
but I have such error:
Error in optim(x = c(26, 73, 84, 115, 123, 132, 159, 207, 240, 241, 254, :
non-finite finite-difference value [1]
In addition: Warning messages:
1: In densfun(x, parm[1], parm[2], ...) : NaNs produced
2: In densfun(x, parm[1], parm[2], ...) : NaNs produced
3: In densfun(x, parm[1], parm[2], ...) : NaNs produced
4: In densfun(x, parm[1], parm[2], ...) : NaNs produced
So I tried with initializing the parameters:
(fitted.gamma <- fitdistr(a, "gamma", start=list(1,1)))
The object fitted.gamma is created but when printed, creates an error:
Error in dn[[2L]] : subscript out of bounds
Do you know what is happening or maybe know some other R functions to fit univariate distributions by MLE?
Thanks in advance for any help or response.
Kuba
Always plot your stuff first, you scaling is far offfffffff.
library(MASS)
a <- c(26,73,84,115,123,132,159,207,240,241,254,268,272,282,300,302,329,346,359,367,375,378, 384,452,475,495,503,531,543,563,594,609,671,687,691,716,757,821,829,885,893,968,1053,1081,1083,1150,1205,1262,1270,1351,1385,1498,1546,1565,1635,1671,1706,1820,1829,1855,1873,1914,2030,2066,2240,2413,2421,2521,2586,2727,2797,2850,2989,3110,3166,3383,3443,3512,3515,3531,4068,4527,5006,5065,5481,6046,7003,7245,7477,8738,9197,16370,17605,25318,58524)
## Ooops, rater wide
plot(hist(a))
fitdistr(a/10000,"gamma") # gives warnings
# No warnings
fitted.gamma <- fitdistr(a/10000, dgamma, start=list(shape = 1, rate = 0.1),lower=0.001)
Now you can decide what to do with the scaling
For data that clearly fits the gamma distribution, but is on the wrong scale (i.e., as if it had been multiplied/divided by a large number), here's an alternative approach to fitting the gamma distribution:
fitgamma <- function(x) {
# Equivalent to `MASS::fitdistr(x, densfun = "gamma")`, where x are first rescaled to
# the appropriate scale for a gamma distribution. Useful for fitting the gamma distribution to
# data which, when multiplied by a constant, follows this distribution
if (!requireNamespace("MASS")) stop("Requires MASS package.")
fit <- glm(formula = x ~ 1, family = Gamma)
out <- MASS::fitdistr(x * coef(fit), "gamma")
out$scaling_multiplier <- unname(coef(fit))
out
}
Usage:
set.seed(40)
test <- rgamma(n = 100, shape = 2, rate = 2)*50000
fitdistr(test, "gamma") # fails
dens_fit <- fitgamma(test) # successs
curve(dgamma(x, 2, 2), to = 5) # true distribution
curve(dgamma(x, dens_fit$estimate['shape'], dens_fit$estimate['rate']), add=TRUE, col=2) # best guess
lines(density(test * dens_fit$scaling_multiplier), col = 3)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With