Using scipy.stats.gaussian_kde with 2 dimensional data

Tags:

I'm trying to use the scipy.stats.gaussian_kde class to smooth out some discrete data collected with latitude and longitude information, so it shows up as somewhat similar to a contour map in the end, where the high densities are the peak and low densities are the valley.

I'm having a hard time putting a two-dimensional dataset into the gaussian_kde class. I've played around to figure out how it works with 1 dimensional data, so I thought 2 dimensional would be something along the lines of:

Click to copy

from scipy import stats
from numpy import array
data = array([[1.1, 1.1],
              [1.2, 1.2],
              [1.3, 1.3]])
kde = stats.gaussian_kde(data)
kde.evaluate([1,2,3],[1,2,3])

which is saying that I have 3 points at [1.1, 1.1], [1.2, 1.2], [1.3, 1.3]. and I want to have the kernel density estimation using from 1 to 3 using width of 1 on x and y axis.

When creating the gaussian_kde, it keeps giving me this error:

Click to copy

raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix

Looking into the source code of gaussian_kde, I realize that the way I'm thinking about what dataset means is completely different from how the dimensionality is calculate, but I could not find any sample code showing how multi-dimension data works with the module. Could someone help me with some sample ways to use gaussian_kde with multi-dimensional data?

589

asked Nov 08 '10 22:11

jet

1 Answers

This example seems to be what you're looking for:

Click to copy

import numpy as np
import scipy.stats as stats
from matplotlib.pyplot import imshow

# Create some dummy data
rvs = np.append(stats.norm.rvs(loc=2,scale=1,size=(2000,1)),
                stats.norm.rvs(loc=0,scale=3,size=(2000,1)),
                axis=1)

kde = stats.kde.gaussian_kde(rvs.T)

# Regular grid to evaluate kde upon
x_flat = np.r_[rvs[:,0].min():rvs[:,0].max():128j]
y_flat = np.r_[rvs[:,1].min():rvs[:,1].max():128j]
x,y = np.meshgrid(x_flat,y_flat)
grid_coords = np.append(x.reshape(-1,1),y.reshape(-1,1),axis=1)

z = kde(grid_coords.T)
z = z.reshape(128,128)

imshow(z,aspect=x_flat.ptp()/y_flat.ptp())

enter image description here

Axes need fixing, obviously.

You can also do a scatter plot of the data with

Click to copy

scatter(rvs[:,0],rvs[:,1])

enter image description here

160

answered Nov 06 '22 08:11

endolith

Related questions
                            
                                Wrong result from scipy.optimize.minimize used on Fortran function using ctypes
                            
                                Trying to understand scipy and numpy interpolation
                            
                                How to convert a scipy csr_matrix back into lists of row, col and data?
                            
                                How to get the p-value between two groups after groupby in pandas?
                            
                                Elementwise maximum of sparse Scipy matrix & vector with broadcasting
                            
                                Efficient way to create a diagonal sparse matrix
                            
                                Add more sample points to data
                            
                                How to efficiently add sparse matrices in Python
                            
                                Dictionary-like efficient storing of scipy/numpy arrays
                            
                                SciPy global minimum curve fit
                            
                                numpy array to scipy.sparse matrix
                            
                                using scipy in django with apache and mod_wsgi
                            
                                How does kmeans know how to cluster documents when we only feed it tfidf vectors of individual words?
                            
                                How Do I put 2 matrix into scipy.optimize.minimize?
                            
                                How to build a chi-square distribution table
                            
                                Building a covariance matrix in Python
                            
                                How to display a graph in ipython notebook
                            
                                Python, Pandas & Chi-Squared Test of Independence
                            
                                Import error: No module named 'scipy._lib'
                            
                                How to find skewness and kurtosis correctly in pandas?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Using scipy.stats.gaussian_kde with 2 dimensional data

Tags:

multidimensional-array

scipy

kernel-density

jet

People also ask

1 Answers

endolith

Recent Activity

Donate For Us