```
from IPython.display import YouTubeVideo
'Vv3f0QNWvWQ') YouTubeVideo(
```

tl;dr: I hacked the `emcee`

–The MCMC-Hammer ensemble sampler to work on `PyMC`

models.

## Motivation

`PyMC`

is an awesome Python module to perform Bayesian inference. It allows for flexible model creation and has basic MCMC samplers like Metropolis-Hastings. The upcoming PyMC3 will feature much fancier samplers like Hamiltonian-Monte Carlo (HMC) that are all the rave these days. While those HMC samplers are very powerful for complex models with lots of dimensions, they also require gradient information of the posterior. AutoDiff (as part of `Theano`

) makes this very easy for exponential family models.

However, in the real world of Big Science, we often do not have likelihoods from the exponential family. For example, the likelihood function of HDDM – our toolbox to do hierarchical Bayesian estimation of a decision making model often used in mathematical psychology – has an infinite sum and requires numerical integration which make it non-differentiable. In many other areas of science, likelihoods result from complex non-linear models that take a while to compute as described in this recent post on Andrew Gelman’s blog.

Largely separated from the hype around HMC that machine learners and statisticians love, astrophysicists have been solving this problems in other ways. Ensemble Samplers with Affine Invariance do not require gradient information but instead run many samplers (called “walkers” – Walking Dead, anyone?) in parallel to cover the space. With expensive likelihoods, having parallel sampling (something that’s not possible with HMC yet) is a game changer.

For a fun visual depiction of HMC vs Ensemble Samplers, see these two videos (make sure to crank your speakers to the max and dance as if the NSA wasn’t watching):

### Hamiltonian Monte Carlo sampler

### Affine Invariant Ensemble sampler

`'yow7Ol88DRk') YouTubeVideo(`

While `PyMC`

does not have support for these Ensemble Samplers, there is `emcee`

– The MCMC Hammer which implements it. `emcee`

however only has the sampler, it currently lacks the model specification capabilities and probability distributions that `PyMC`

offers. It is thus an obvious idea to try and combine the emcee sampler with the `PyMC`

functionality for model building. This here is a very dirty hack that does just that.

```
import numpy as np
import emcee
import pymc as pm
```

```
/usr/local/lib/python2.7/dist-packages/scikits/__init__.py:1: UserWarning: Module mpl_toolkits was already imported from None, but /usr/local/lib/python2.7/dist-packages is being added to sys.path
__import__('pkg_resources').declare_namespace(__name__)
```

This is copy pasted from the bioessay_gelman.py example of `PyMC`

.

```
from pymc import *
from numpy import ones, array
# Samples for each dose level
= 5 * ones(4, dtype=int)
n # Log-dose
= array([-.86, -.3, -.05, .73])
dose
# Logit-linear model parameters
= Normal('alpha', 0, 0.01)
alpha = Normal('beta', 0, 0.01)
beta
# Calculate probabilities of death
= Lambda('theta', lambda a=alpha, b=beta, d=dose: invlogit(a + b * d), plot=False)
theta
# Data likelihood
= Binomial(
deaths 'deaths',
=n,
n=theta,
p=array([0,
value1,
3,
5],
=float),
dtype=True)
observed
# Calculate LD50
= Lambda('LD50', lambda a=alpha, b=beta: -a / b) LD50
```

Build and sample from the model the PyMC way to see that the outcome between the Metropolis-Hastings sampler and emcee are the same.

```
= pm.MAP([alpha, beta, theta, deaths])
m
m.fit()= pm.MCMC(m)
m 5000, burn=1000)
m.sample( pm.Matplot.plot(m)
```

```
[****************100%******************] 5000 of 5000 completePlotting alpha
Plotting beta
```

## The unholy union

OK, you have been warned.

The central trick is to use the PyMC model to compute the logp of the model. We set the parameters externally and call `.logp`

which we pass to the emcee sampler.

```
# Build model
= pm.Model([alpha, beta, theta, deaths])
m
# This is the likelihood function for emcee
def lnprob(vals): # vals is a vector of parameter values to try
# Set each random variable of the pymc model to the value
# suggested by emcee
for val, var in zip(vals, m.stochastics):
= val
var.value
# Calculate logp
return m.logp
# emcee parameters
= len(m.stochastics)
ndim = 500
nwalkers
# Find MAP
pm.MAP(m).fit()= np.empty(ndim)
start for i, var in enumerate(m.stochastics):
= var.value
start[i]
# sample starting points for walkers around the MAP
= np.random.randn(ndim * nwalkers).reshape((nwalkers, ndim)) + start
p0
# instantiate sampler passing in the pymc likelihood function
= emcee.EnsembleSampler(nwalkers, ndim, lnprob)
sampler
# burn-in
= sampler.run_mcmc(p0, 10)
pos, prob, state
sampler.reset()
# sample 10 * 500 = 5000
10)
sampler.run_mcmc(pos,
# Save samples back to pymc model
= pm.MCMC(m)
m 1) # This call is to set up the chains
m.sample(for i, var in enumerate(m.stochastics):
0] = sampler.flatchain[:, i]
var.trace._trace[
pm.Matplot.plot(m)
```

```
[ 0% ]Plotting alpha
Plotting beta
```

We get the same posterior back as with `PyMC`

which is encouraging. The autocorrelation is also lower (with a high number of walkers, looks worse with fewer walkers) although I expected it to be even lower as this is a big selling point for `emcee`

.

## HMC vs Affine Invariant Ensemble samplers

I think HMC and Ensemble samplers are both relevant, albeit in different domains. Without having done any experiments, I expect HMC to do well for exponential family models with high dimensions and complex posteriors. Ensemble samplers can deal with non-differentiable, expensive to compute likelihood functions as they often appear in science. They can not escape the curse of dimensionality, however, as the size of the space grows exponentially but adding walkers only scales linear.

## Future directions

`emcee`

could be included into`PyMC`

as a custom step method which would reduce the hackiness.- Parallelization: For me the biggest selling point for Ensemble Samplers is that they can be parallelized quite easily (something not possible for HMC). This helps especially with expensive likelihoods that can require simulation to evaluate.
`emcee`

already has support for this but the`lnprob`

function must be pickable and`PyMC`

models are not pickable. It is possible, however, to recreate a`PyMC`

model in each process. This would be quite easy with this hack but doesn’t seem as straight forward if`emcee`

were integrated in`PyMC`

.