Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

what does the 'find_MAP' output mean in pymc3?

Tags:

pymc3

What are the returned values of find_MAP in pymc3 ?

It seems that pymc3.Normal and pymc3.Uniform variables are not considered the same: for pymc3.Normal variables, find_MAP returns a value that looks like the maximum a posteriori probability. But for pymc3.Uniform variables, I get a '_interval' suffix added to the name of the variable and I don't find anywhere in the doc the meaning of the returned value (that may seem absurd, not even within the physical limits).

For example:

import numpy as np
import pymc3 as pm3
# create basic data such as obs = (x*0.95)**2+1.1+noise
x=np.arange(10)+1
obs=(x*0.95)**2+np.random.randn(10)+1.1
# fitting the model y=a(1*x)**2+a0 on data points
with pm3.Model() as model:
    a0 = pm3.Uniform("a0",0,5)
    a1 = pm3.Normal("a1",mu=1,sd=1)
    a2 = pm3.Deterministic('a2',(x*a1)**2+a0)
    hypothesis = pm3.Normal('hypothesis', mu=a2, sd=0.1, observed=obs)
    start = pm3.find_MAP()
print('start: ',start)

returns:

Optimization terminated successfully.
         Current function value: 570.382509
         Iterations: 13
         Function evaluations: 17
         Gradient evaluations: 17
start:  {'a1': array(0.9461006484031161), 'a0_interval_': array(-1.0812715249577414)}
like image 656
Stéphane Avatar asked Feb 09 '17 21:02

Stéphane


1 Answers

By default, pymc3 transforms some variables with bounded support to the set of real numbers. The enables a variety of operations which would otherwise choke when given bounded distributions (e.g. some methods of optimizations and sampling). When this automatic transformation is applied, the random variable that you added to the model becomes a child of the transformed variable. This transformed variable is added to the model with [var]_[transform]_ as the name.

The default transformation for a uniform random variable is called the "interval transformation" and the new name of this variable is [name]_interval_. The MAP estimate is the found by optimizing all of the parameters to maximize the posterior probability. We only need to optimize the transformed variable, since this fully determines the value of the variable you originally added to the model. pm.find_MAP() returns only the variables being optimized, not the original variables. Notice that a2 is also not returned, since it is fully determined by a0 and a1.

The code that pymc3 uses for the interval transformation [^1] is

def forward(self, x):
    a, b = self.a, self.b
    r = T.log((x - a) / (b - x))
    return r

Where a is the lower bound, b is the upper bound, and x is the variable being transformed. Using this map, values which are very close to the lower bound have transformed values approaching negative infinity, and values very close to the upper bound approach positive infinity.

Knowing the bounds, we can convert back from the real line to the bounded interval. The code that pymc3 uses for this is

def backward(self, x):
    a, b = self.a, self.b
    r = (b - a) * T.exp(x) / (1 + T.exp(x)) + a
    return r

If you apply this backwards transformation yourself, you can recover a0 itself:

(5 - 0) * exp(-1.0812715249577414) / (1 + exp(-1.0812715249577414)) + 0 = 1.26632733897

Other transformations automatically applied include the log transform (for variables bounded on one side), and the stick-breaking transform (for variables which sum to 1).

[^1] As of commit 87cdd712c86321121c2ed3150764f3d847f5083c.

like image 141
bsmith89 Avatar answered Nov 23 '22 09:11

bsmith89