Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is a chain in PyMC3?

I am learning PyMC3 for Bayesian modeling. You can create a model and sample with:

import pandas as pd
import pymc3 as pm

# obs is a DataFrame with a single column, containing
# the observed values for variable height
obs = pd.DataFrame(...)

# we create a pymc3 model
with pm.Model() as m:
    mu = pm.Normal('mu', mu=178, sd=20)
    sigma = pm.Uniform('sigma', lower=0, upper=50)
    height = pm.Normal('height', mu=mu, sd=sigma, observed=obs)
    trace = pm.sample(1000, tune=1000)

pm.traceplot(trace)

pymc3 output

When I check the trace (in this case 1000 samples from the posterior probability), I notice that 2 chains are created:

>>> trace.nchains
2

I read the tutorial on PyMC3 and looked through the API but it is unclear to me what a chain represents (in this case I asked for 1000 samples from the posterior but I got 2 chains, each one with 1000 samples from the posterior).

Are the chains different runs of the sampler with the same parameters or do they have some other meaning/purpose?

like image 963
gc5 Avatar asked Apr 13 '18 21:04

gc5


People also ask

How many chains PyMC3?

By default, PyMC3 will run one chain for each core available. This used 4 cores to sample 4 chains, and did it in less than a second.

How does PyMC3 work?

PyMC3 is a new open source Probabilistic Programming framework written in Python that uses Theano to compute gradients via automatic differentiation as well as compile probabilistic programs on-the-fly to C for increased speed.

What does PyMC3 stand for?

PyMC (formerly known as PyMC3) is a Python package for Bayesian statistical modeling and probabilistic machine learning which focuses on advanced Markov chain Monte Carlo and variational fitting algorithms.

Why use PyMC3?

PyMC3 allows you to solve data science problems of varying complexity that you can hardly solve any other way. The tool is highly versatile and is being used successfully by various companies. For example, SpaceX used PyMC3 to optimize its supply chains (as explained in this blog post).


1 Answers

A chain is a single run of MCMC. So if you have six 2-d parameters in your model and ask for 1000 samples, you will get six 2x1000 arrays for each chain.

When running MCMC, it is a best practice to use multiple chains, as they can help diagnose problems. For example, the Gelman-Rubin diagnostic requires multiple chains, and runs automatically (using joblib, which tries to use multiple cores if possible) if you use more than 1 chain in PyMC3.

As a concrete example of when you might want multiple chains, consider sampling from a multimodal distribution. Even the NUTS sampler may not visit both modes in a single chain, but you could diagnose this using multiple chains.

multiple chains

Note that PyMC3 usually combines the chains when you work with them (e.g., using trace.get_values('my_var')), since they are all valid MCMC samples. This does lead to some confusing behavior in that asking for 1000 samples actually gets you 4000 on most systems, where you get 4 chains by default.

like image 145
colcarroll Avatar answered Oct 20 '22 16:10

colcarroll