Vectorized implementation for `numpy.random.multivariate_normal`

Question

I am trying to use numpy.random.multivariate_normal to generate multiple samples where each sample is drawn from a multivariate Normal distribution with a different mean and cov. For example, if I would like to draw 2 samples, I tried

from numpy import random as rand

means = np.array([[-1., 0.], [1., 0.]])
covs = np.array([np.identity(2) for k in xrange(2)]) 
rand.multivariate_normal(means, covs)

but this results in ValueError: mean must be 1 dimensional. Do I have to do a for loop for this? I thought that for functions like rand.binomial this is possible.

Warren Weckesser · Accepted Answer

As @hpaulj suggested, you can generate samples from the standard multivariate normal distribution, and then use, say, einsum and/or broadcasting to transform the samples. The scaling is done by multiplying the standard sample points by the square root of the covariance matrix. In the following, I use scipy.linalg.sqrtm to compute the matrix square root, and numpy.einsum to do the matrix multiplication.

import numpy as np
from scipy.linalg import sqrtm
import matplotlib.pyplot as plt


# Sequence of means
means = np.array([[-15., 0.], [15., 0.], [0., 0.]])
# Sequence of covariance matrices.  Must be the same length as means.
covs = np.array([[[ 3, -1],
                  [-1,  2]],
                 [[ 1,  2],
                  [ 2,  5]],
                 [[ 1,  0],
                  [ 0,  1]]])
# Number of samples to generate for each (mean, cov) pair.
nsamples = 4000

# Compute the matrix square root of each covariance matrix.
sqrtcovs = np.array([sqrtm(c) for c in covs])

# Generate samples from the standard multivariate normal distribution.
dim = len(means[0])
u = np.random.multivariate_normal(np.zeros(dim), np.eye(dim),
                                  size=(len(means), nsamples,))
# u has shape (len(means), nsamples, dim)

# Transform u.
v = np.einsum('ijk,ikl->ijl', u, sqrtcovs)
m = np.expand_dims(means, 1)
t = v + m

# t also has shape (len(means), nsamples, dim).
# t[i] holds the nsamples sampled from the distribution with mean means[i]
# and covariance cov[i].

plt.subplot(2, 1, 1)
plt.plot(t[...,0].ravel(), t[...,1].ravel(), '.', alpha=0.02)
plt.axis('equal')
plt.xlim(-25, 25)
plt.ylim(-8, 8)
plt.grid()

# Make another plot, where we generate the samples by passing the given
# means and covs to np.random.multivariate_normal.  This plot should look
# the same as the first plot.
plt.subplot(2, 1, 2)
p0 = np.random.multivariate_normal(means[0], covs[0], size=nsamples)
p1 = np.random.multivariate_normal(means[1], covs[1], size=nsamples)
p2 = np.random.multivariate_normal(means[2], covs[2], size=nsamples)

plt.plot(p0[:,0], p0[:,1], 'b.', alpha=0.02)
plt.plot(p1[:,0], p1[:,1], 'g.', alpha=0.02)
plt.plot(p2[:,0], p2[:,1], 'r.', alpha=0.02)
plt.axis('equal')
plt.xlim(-25, 25)
plt.ylim(-8, 8)
plt.grid()

plot

This method might not be any faster that looping over the means and covs arrays and calling multivariate_normal once for each pair (mean, cov). The case where this method would give the most benefit is when you have many different means and covariances and are generating a small number of samples per pair. And even then, it might not be faster, because the script uses a Python loop over the covs array to call sqrtm for each covariance matrix. If performance is critical, test with your actual data.

Vectorized implementation for `numpy.random.multivariate_normal`

Tags:

python

numpy

p-value

1 Answers

Warren Weckesser

Recent Activity

Donate For Us

Vectorized implementation for `numpy.random.multivariate_normal`

Tags:

python

numpy

p-value

1 Answers

Warren Weckesser

Related questions

Recent Activity

Donate For Us