Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DistributionFitTest[] for custom distributions in Mathematica

I have PDFs and CDFs for two custom distributions, a means of generating RandomVariates for each, and code for fitting parameters to data. Some of this code I've posted previously at:

Calculating expectation for a custom distribution in Mathematica

Some of it follows:

nlDist /: PDF[nlDist[alpha_, beta_, mu_, sigma_], 
   x_] := (1/(2*(alpha + beta)))*alpha* 
   beta*(E^(alpha*(mu + (alpha*sigma^2)/2 - x))* 
      Erfc[(mu + alpha*sigma^2 - x)/(Sqrt[2]*sigma)] + 
     E^(beta*(-mu + (beta*sigma^2)/2 + x))* 
      Erfc[(-mu + beta*sigma^2 + x)/(Sqrt[2]*sigma)]); 

nlDist /: 
  CDF[nlDist[alpha_, beta_, mu_, sigma_], 
   x_] := ((1/(2*(alpha + beta)))*((alpha + beta)*E^(alpha*x)* 
        Erfc[(mu - x)/(Sqrt[2]*sigma)] - 
       beta*E^(alpha*mu + (alpha^2*sigma^2)/2)*
        Erfc[(mu + alpha*sigma^2 - x)/(Sqrt[2]*sigma)] + 
       alpha*E^((-beta)*mu + (beta^2*sigma^2)/2 + alpha*x + beta*x)*
        Erfc[(-mu + beta*sigma^2 + x)/(Sqrt[2]*sigma)]))/ 
   E^(alpha*x);         

dplDist /: PDF[dplDist[alpha_, beta_, mu_, sigma_], x_] := 
  PDF[nlDist[alpha, beta, mu, sigma], Log[x]]/x;
dplDist /: CDF[dplDist[alpha_, beta_, mu_, sigma_], x_] := 
  CDF[nlDist[alpha, beta, mu, sigma], Log[x]];

nlDist /: DistributionDomain[nlDist[alpha_, beta_, mu_, sigma_]] := 
 Interval[{-Infinity, Infinity}]

nlDist /: 
    Random`DistributionVector[
    nlDist [alpha_, beta_, mu_, sigma_], n_, prec_] :=
    RandomVariate[ExponentialDistribution[alpha], n, 
        WorkingPrecision -> prec] - 
      RandomVariate[ExponentialDistribution[beta], n, 
        WorkingPrecision -> prec] + 
      RandomVariate[NormalDistribution[mu, sigma], n, 
        WorkingPrecision -> prec];

dplDist /: 
    Random`DistributionVector[
    dplDist[alpha_, beta_, mu_, sigma_], n_, prec_] :=
    Exp[RandomVariate[ExponentialDistribution[alpha], n, 
         WorkingPrecision -> prec] - 
       RandomVariate[ExponentialDistribution[beta], n, 
         WorkingPrecision -> prec] + 
       RandomVariate[NormalDistribution[mu, sigma], n, 
         WorkingPrecision -> prec]];

I can post more of the code if someone needs to see it, but I think the above gives a good sense of the approach so far.

Now I need a way to use DistributionFitTest[] with these distributions in something like this:

DistributionFitTest[data, dplDist[3.77, 1.34, -2.65, 0.40],"HypothesisTestData"]  

Ah, but this doesn't work. Instead I get an error message that starts out as:

"The argument dplDist[3.77,1.34,-2.65,0.4] should be a valid distribution..."

So it appears that DistributionFitTest[] doesn't recognize these distributions as distributions.

I don't see how using TagSet would help in this instance, unless one can use TagSet to give DistributionFitTest[] what it needs to identify these custom distributions.

Can anyone advise me of a way to get this to work? I'd like to use DistributionFitTest[] with custom distributions like this or find some work around to assess goodness of fit.

Thx -- Jagra

like image 390
Jagra Avatar asked Jun 15 '11 17:06

Jagra


1 Answers

Since this question has come up many times, I think it's prime time to furnish some recipes for how to properly cook a custom distribution for v8.

Use TagSet to define for your distribution:

  1. DistributionParameterQ, DistributionParameterAssumptions, DistributionDomain
  2. Define PDF, CDF, SurvivalFunction, HazardFunction
  3. Define random number generation code by coding Random`DistributionVector

Doing so will make everything but parameter estimation work for your distribution.

Your mistake was that dplDist had no DistributionDomain definition, and both nlDist and dplDist did not have DistributionParameterQ and DistributionParameterAssumptions definitions.

I added to your definitions the following:

dplDist /: DistributionDomain[dplDist[alpha_, beta_, mu_, sigma_]] := 
 Interval[{-Infinity, Infinity}]

nlDist /: 
 DistributionParameterQ[nlDist[alpha_, beta_, mu_, sigma_]] := ! 
  TrueQ[Not[
    Element[{alpha, beta, sigma, mu}, Reals] && alpha > 0 && 
     beta > 0 && sigma > 0]]

dplDist /: 
 DistributionParameterQ[dplDist[alpha_, beta_, mu_, sigma_]] := ! 
  TrueQ[Not[
    Element[{alpha, beta, sigma, mu}, Reals] && alpha > 0 && 
     beta > 0 && sigma > 0]]

nlDist /: 
 DistributionParameterAssumptions[
  nlDist[alpha_, beta_, mu_, sigma_]] := 
 Element[{alpha, beta, sigma, mu}, Reals] && alpha > 0 && beta > 0 && 
  sigma > 0

dplDist /: 
 DistributionParameterAssumptions[
  dplDist[alpha_, beta_, mu_, sigma_]] := 
 Element[{alpha, beta, sigma, mu}, Reals] && alpha > 0 && beta > 0 && 
  sigma > 0

And now it worked:

In[1014]:= data = RandomVariate[dplDist[3.77, 1.34, -2.65, 0.40], 100];

In[1015]:= DistributionFitTest[data, dplDist[3.77, 1.34, -2.65, 0.40],
  "HypothesisTestData"]

Out[1015]= HypothesisTestData[<<DistributionFitTest>>]
like image 88
Sasha Avatar answered Oct 21 '22 03:10

Sasha