I'm trying to use the fitdist() function from the fitdistrplus package to fit my data to different distributions. Let's say that my data looks like this:
x = c(1.300000, 1.220000, 1.160000, 1.300000, 1.380000, 1.240000,
1.150000, 1.180000, 1.350000, 1.290000, 1.150000, 1.240000,
1.150000, 1.120000, 1.260000, 1.120000, 1.460000, 1.310000,
1.270000, 1.260000, 1.270000, 1.180000, 1.290000, 1.120000,
1.310000, 1.120000, 1.220000, 1.160000, 1.460000, 1.410000,
1.250000, 1.200000, 1.180000, 1.830000, 1.670000, 1.130000,
1.150000, 1.170000, 1.190000, 1.380000, 1.160000, 1.120000,
1.280000, 1.180000, 1.170000, 1.410000, 1.550000, 1.170000,
1.298701, 1.123595, 1.098901, 1.123595, 1.110000, 1.420000,
1.360000, 1.290000, 1.230000, 1.270000, 1.190000, 1.180000,
1.298701, 1.136364, 1.098901, 1.123595, 1.316900, 1.281800,
1.239400, 1.216989, 1.785077, 1.250800, 1.370000)
Next, if I run fitdist(x, "gamma"), everything is fine, but if I use fitdist(x, "beta") instead, I get the following error:
Error in start.arg.default(data10, distr = distname) :
values must be in [0-1] to fit a beta distribution
OK, I'm not a native English speaker, but as far as I understand, this method requires the data to be in the range [0,1], so I scale it using x_scaled = (x-min(x))/max(x). This gives me a vector with values in that range that correlates perfectly with the original vector x.
Because x_scaled is of class matrix, I convert it into a numeric vector using as.numeric(), and then fit the model with fitdist(x_scale,"beta").
This time I get the following error:
Error in fitdist(x_scale, "beta") :
the function mle failed to estimate the parameters, with the error code 100
After that I ran several searches but didn't find anything useful. Does anybody have an idea of what is going wrong here? Thank you.
Reading the source code shows that the default estimation method of fitdist is mle, which calls mledist from the same package. mledist constructs a negative log-likelihood for the distribution you have chosen and uses optim or constrOptim to minimize it numerically. If anything goes wrong during the numerical optimization, you get the error message you saw.
The error seems to occur because, when x_scaled contains 0 or 1, the negative log-likelihood of the beta distribution cannot be evaluated properly, so the numerical optimization simply breaks. One dirty trick is to set x_scaled <- (x - min(x) + 0.001) / (max(x) - min(x) + 0.002), so that x_scaled contains neither 0 nor 1, and fitdist will work.
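A minimal sketch of that workaround, assuming fitdistrplus is installed. The vector is a small subset of the values from the question, and eps is just a name for the arbitrary 0.001 offset.

```r
# Small illustrative sample (a subset of the values from the question).
x <- c(1.30, 1.22, 1.16, 1.38, 1.24, 1.15, 1.46, 1.83, 1.67, 1.12, 1.11)

# Plain min-max scaling puts the smallest value at exactly 0
# and the largest at exactly 1.
x_minmax <- (x - min(x)) / (max(x) - min(x))
print(range(x_minmax))

# Padded scaling: every value ends up strictly inside (0, 1).
eps <- 0.001  # arbitrary small offset
x_scaled <- (x - min(x) + eps) / (max(x) - min(x) + 2 * eps)
print(range(x_scaled))

# Fit the beta distribution only if the package is available.
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  fit <- fitdistrplus::fitdist(x_scaled, "beta")
  print(fit)
}
```

Note that the fitted shape parameters describe the rescaled data, and they depend on the choice of eps, so the offset should be reported along with the fit.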