Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to define your own distribution for fitdistr function in R with the help of lmomco function

Tags:

r

distribution

I would like to define my own distributions to use with the fitdistrplus function to fit my monthly precipitation data from now on refered as "month". I am using the “lmomco” function to help me define the distributions, but cannot manage to make it work. For example, I am defining the generalized extreme value (gev) distribution like the following:

dgev<-pdfgev   #functions which are included in lmomco
 pgev<-cdfgev
qgev<-quagev

Since "fitdistrplus" needs the argument "start", which consists of the initial parameter values for the desired distribution, I am estimating these initial values as the following:

lmom=lmoms(month,nmom=5)     #from lmomco package
para=pargev(lmom, checklmom=TRUE)

Now, I finally try using the “fitdist” function to fit “month” to the gev distribution as:

fitgev <- fitdist(month, "gev", start=para[2]) #fitdistrplus

I get an error like the one below. It does not matter which distribution I define with the help of “lmomco”, I get the same error. Could someone give me a hint on what am I doing wrong? Thanks!

fitgev <- fitdist(month, "gev", start=para[2])
[1] "Error in dgev(c(27.6, 97.9, 100.6, 107.3, 108.5, 109, 112.4, 120.9, 137.8,  : \n  unused arguments (para.xi = 196.19347977195, para.alpha = 91.9579520442104, para.kappa = -0.00762962879097294)\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in dgev(c(27.6, 97.9, 100.6, 107.3, 108.5, 109, 112.4, 120.9, 137.8, 138.4, 144.7, 156.8, 163.1, 168.9, 169.1, 171.4, 176.1, 177.1, 178.8, 178.9, 187.2, 190.2, 190.5, 190.8, 191.2, 193.1, 195.2, 198.5, 199.8, 201.7, 206.9, 213.4, 220.7, 240, 253.5, 254.5, 256.1, 256.4, 257.5, 258.3, 261.5, 263.7, 264.7, 279.1, 284.2, 313.1, 314.7, 319.4, 321.6, 328.9, 330.1, 332.2, 358.3, 366.8, 367.9, 403.5, 424.1, 425.9, 457.3, 459.7, 468, 497.1, 508.5, 547.1), para.xi = 196.19347977195, para.alpha = 91.9579520442104,     para.kappa = -0.00762962879097294): unused arguments (para.xi = 196.19347977195, para.alpha = 91.9579520442104, para.kappa = -0.00762962879097294)>
Error in fitdist(month, "gev", start = para[2]) : 
  the function mle failed to estimate the parameters, 
                with the error code 100
like image 430
BeaQM Avatar asked Apr 27 '15 13:04

BeaQM


2 Answers

fitdist is expecting a density/distribution function with named arguments.

library("lmomco")
library("fitdistrplus")
## reproducible:
month <- c(27.6, 97.9, 100.6, 107.3, 108.5,
          109, 112.4, 120.9, 137.8)

Setup:

lmom <- lmoms(month,nmom=5)     #from lmomco package
para <- pargev(lmom, checklmom=TRUE)
dgev <- pdfgev   #functions which are included in lmomco
pgev <- cdfgev
fitgev <- fitdist(month, "gev", start=para[[2]])
## Error in mledist(data, distname, start, fix.arg, ...) : 
##   'start' must specify names which are arguments to 'distr'

It turns out we need to redefine dgev with several additional bits of plumbing that will make both fitdist and pdfgev happy:

 dgev <- function(x,xi,alpha,kappa) {
    pdfgev(x,list(type="gev",para=c(xi,alpha,kappa),source="pargev"))
}
fitgev <- fitdist(month, "gev", start=para[[2]])               
## Fitting of the distribution ' gev ' by maximum likelihood 
## Parameters:
##         estimate Std. Error
## xi    -25.587734         NA
## alpha  75.009842         NA
## kappa   1.593815         NA
like image 73
Ben Bolker Avatar answered Sep 29 '22 11:09

Ben Bolker


Make sure the argument in you cumulative function has variable q: pgev(q, par1, par2) instead of pgev(x, par1, par2)

Because the error message essentially tells you that it couldn't find the variable q.

The keypoint is to use: x as pdf input ;q as cdf input ;p as inverse cdf input

For example, fitting a Gumble Distribution defined by yourself

# Data

x1 <- c(6.4,13.3,4.1,1.3,14.1,10.6,9.9,9.6,15.3,22.1,13.4,
13.2,8.4,6.3,8.9,5.2,10.9,14.4)

# Define pdf, cdf , inverse cdf

dgumbel <- function(x,a,b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q,a,b) exp(-exp((a-q)/b))
qgumbel <- function(p,a,b) a-b*log(-log(p))

# Fit with MLE
   f1c <- fitdist(x1,"gumbel",start=list(a=10,b=5))

# Goodness of Fit
   gofstat(f1c, fitnames = 'Gumbel MLE')

Reference: https://www.rdocumentation.org/packages/fitdistrplus/versions/0.2-1/topics/fitdist

like image 26
Kevin Avatar answered Sep 29 '22 10:09

Kevin