I am trying to use SciPy's gaussian_kde
function to estimate the density of multivariate data. In my code below I sample a 3D multivariate normal and fit the kernel density but I'm not sure how to evaluate my fit.
import numpy as np
from scipy import stats
mu = np.array([1, 10, 20])
sigma = np.matrix([[4, 10, 0], [10, 25, 0], [0, 0, 100]])
data = np.random.multivariate_normal(mu, sigma, 1000)
values = data.T
kernel = stats.gaussian_kde(values)
I saw this but not sure how to extend it to 3D.
Also not sure how do I even begin to evaluate the fitted density? How do I visualize this?
gaussian_kde(dataset, bw_method=None, weights=None)[source] Representation of a kernel-density estimate using Gaussian kernels. Kernel density estimation is a way to estimate the probability density function (PDF) of a random variable in a non-parametric way.
Kernel Density Estimation (KDE) It is estimated simply by adding the kernel values (K) from all Xj. With reference to the above table, KDE for whole data set is obtained by adding all row values. The sum is then normalized by dividing the number of data points, which is six in this example.
Creating a Univariate Seaborn KdeplotThe seaborn. kdeplot() function is used to plot the data against a single/univariate variable. It represents the probability distribution of the data values as the area under the plotted curve. In the above example, we have generated some random data values using the numpy.
There are several ways you might visualize the results in 3D.
The easiest is to evaluate the gaussian KDE at the points that you used to generate it, and then color the points by the density estimate.
For example:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
mu=np.array([1,10,20])
sigma=np.matrix([[4,10,0],[10,25,0],[0,0,100]])
data=np.random.multivariate_normal(mu,sigma,1000)
values = data.T
kde = stats.gaussian_kde(values)
density = kde(values)
fig, ax = plt.subplots(subplot_kw=dict(projection='3d'))
x, y, z = values
ax.scatter(x, y, z, c=density)
plt.show()
If you had a more complex (i.e. not all lying in a plane) distribution, then you might want to evaluate the KDE on a regular 3D grid and visualize isosurfaces (3D contours) of the volume. It's easiest to use Mayavi for the visualiztion:
import numpy as np
from scipy import stats
from mayavi import mlab
mu=np.array([1,10,20])
# Let's change this so that the points won't all lie in a plane...
sigma=np.matrix([[20,10,10],
[10,25,1],
[10,1,50]])
data=np.random.multivariate_normal(mu,sigma,1000)
values = data.T
kde = stats.gaussian_kde(values)
# Create a regular 3D grid with 50 points in each dimension
xmin, ymin, zmin = data.min(axis=0)
xmax, ymax, zmax = data.max(axis=0)
xi, yi, zi = np.mgrid[xmin:xmax:50j, ymin:ymax:50j, zmin:zmax:50j]
# Evaluate the KDE on a regular grid...
coords = np.vstack([item.ravel() for item in [xi, yi, zi]])
density = kde(coords).reshape(xi.shape)
# Visualize the density estimate as isosurfaces
mlab.contour3d(xi, yi, zi, density, opacity=0.5)
mlab.axes()
mlab.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With