FloatingPointError from PyMC in sampling from a Dirichlet distribution

After being unsuccessful in using decorators to define the stochastic object for the "logarithm of an exponential random variable", I decided to write the code for this new distribution manually using pymc.stochastic_from_dist. The model that I am trying to implement is available here (the first model): [image of the hierarchical model]

Now when I try to sample log(alpha) using MCMC Metropolis with a Normal distribution as the proposal (the sampling method indicated in the model above), I get the following error:

  File "/Library/Python/2.7/site-packages/pymc/distributions.py", line 980, in rdirichlet
    return (gammas[0]/gammas[0].sum())[:-1]

FloatingPointError: invalid value encountered in divide
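
For reference, the line that fails divides a vector of gamma draws by their sum; if every draw underflows to exactly 0.0, that division is 0/0. A minimal sketch of the same situation (my own illustration, assuming numpy's error state is set to raise, as the FloatingPointError above implies):

import numpy as np

np.seterr(invalid='raise')           # make the usual nan warning raise instead
gammas = np.zeros(3)                 # gamma draws that have underflowed to zero
print((gammas / gammas.sum())[:-1])  # 0/0 -> FloatingPointError, as in the traceback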

That said, on the runs where the sampling does not hit this error, the resulting histograms match the ones in this paper. My hierarchical model is:

"""
A Hierarchical Bayesian Model for Bags of Marbles

logalpha ~ logarithm of an exponential distribution with parameter lambd
beta ~ Dirichlet([black and white ball proportions]:vector of 1's)
theta ~ Dirichlet(alpha*beta(vector))

"""

import numpy as np
import pymc
from scipy.stats import expon
lambd=1.
__all__=['alpha','beta','theta','logalpha']
#------------------------------------------------------------
# Set up pyMC model: logExponential
# 1 parameter: (alpha)

def logExp_like(x, explambda):
    """Log-likelihood of log(Y) where Y ~ Exponential(rate=explambda)."""
    # The additive constant log(explambda) is dropped; it does not affect MCMC.
    return -explambda * np.exp(x) + x

def rlogexp(explambda, size=None):
    """Random draws from the logExponential distribution."""
    # numpy's exponential is parameterized by the scale, i.e. 1/rate
    sample = np.random.exponential(1. / explambda, size)
    logSample = np.log(sample)
    return logSample
logExponential=pymc.stochastic_from_dist('logExponential',logp=logExp_like,
                                          random=rlogexp,
                                          dtype=np.float,
                                          mv=False)
#------------------------------------------------------------
# Defining model parameters beta and logalpha.
beta=pymc.Dirichlet('beta',theta=[1,1])
logalpha=logExponential('logalpha',lambd)

@pymc.deterministic(plot=False)
def multipar(a=logalpha, b=beta):
    # b holds only the first k-1 = 1 component of the Dirichlet; the last
    # component is 1 - b. The output is the full parameter vector alpha*beta.
    out = np.empty(2)
    out[0] = np.exp(a) * b
    out[1] = np.exp(a) * (1 - b)
    return out
theta=pymc.Dirichlet('theta',theta=multipar)

And my test sampling code is:

from pymc import Metropolis
from pymc import MCMC
from matplotlib import pyplot as plt
import HBM   # the hierarchical model above, saved as HBM.py
import numpy as np
import pymc
import scipy

M = MCMC(HBM)
# Metropolis step for logalpha with a Normal proposal
M.use_step_method(Metropolis, HBM.logalpha, proposal_sd=1., proposal_distribution='Normal')
M.sample(iter=1000, burn=200)

When I check the values of theta passed to the gamma distribution in line 978 of distributions.py, I see that they are not zero, just very small. So how can I prevent this floating point error?
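
For what it's worth, a small check (my own illustration, not from the paper) suggests that even small, non-zero shape parameters make numpy's gamma sampler return exact zeros a large fraction of the time, which would then lead to the 0/0 above:

import numpy as np

# Illustrative shape value: with a tiny shape parameter most of the Gamma mass
# lies below the smallest representable double, so draws underflow to exactly 0.0.
draws = np.random.gamma(1e-3, size=100000)
print((draws == 0.0).mean())   # roughly half of the draws come out as exact zeros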

Asked Oct 17 '13 18:10 by Cupitor


1 Answer

I found this nugget in their documentation:

The stochastic variable cutoff cannot be smaller than the largest element of D, otherwise D’s density would be zero. The standard Metropolis step method can handle this case without problems; it will propose illegal values occasionally, but these will be rejected.

This would lead me to believe that dtype=np.float (which is essentially the same range as the built-in float) may not be the way you want to go. The documentation says the dtype needs to be a numpy dtype, but really it just needs to be something that converts to a numpy dtype object, and in Python 2 (correct me if I'm wrong) the numeric dtypes were fixed-size types, so np.float and float amount to the same thing. Maybe utilizing the Decimal module would be an option. That way you can set the level of precision to cover the value ranges you expect, and pass it to your extended stochastic method, where it would be converted.

from decimal import Decimal, getcontext
getcontext().prec = 15   # 15 significant digits, comparable to double precision
dtype = Decimal
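
As a rough sketch of what I mean (untested; logExp_like and rlogexp are the functions from your HBM module, and I'm only assuming stochastic_from_dist will accept Decimal the same way it accepts np.float):

import pymc
from decimal import Decimal, getcontext
from HBM import logExp_like, rlogexp      # the functions defined in your question

getcontext().prec = 15                    # enough significant digits for the expected range

# Same construction as in your model, but with Decimal swapped in for np.float
logExponential = pymc.stochastic_from_dist('logExponential',
                                           logp=logExp_like,
                                           random=rlogexp,
                                           dtype=Decimal,
                                           mv=False)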

I don't know whether this would still be truncated once the numpy library got hold of it, or whether it would respect the inherited level of precision. I have no accurate way of testing this, but give it a try and let me know how it works for you.

Edit: I tested the notion of precision inheritance and it would seem to hold:

>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 10
>>> Decimal(1) / Decimal(7)
Decimal('0.1428571429')
>>> np.float(Decimal(1) / Decimal(7))
0.1428571429
>>> getcontext().prec = 15
>>> np.float(Decimal(1) / Decimal(7))
0.142857142857143
>>> 
Answered Nov 06 '22 01:11 by Xinthral