I've looked online and have yet to find an answer or way to figure the following
I'm translating some MATLAB code to Python where in MATLAB im looking to find the kernel density estimation with the function:
[p,x] = ksdensity(data)
where p is the probability at point x in the distribution.
Scipy has a function but only returns p.
Is there a way to find the probability at values of x?
Thanks!
Another option is the kernel density estimator in the Scikit-Learn Python package, sklearn.neighbors.KernelDensity
Here is a little example similar to the Matlab documentation for ksdensity for a Gaussian distribution:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KernelDensity
np.random.seed(12345)
# similar to MATLAB ksdensity example x = [randn(30,1); 5+randn(30,1)];
Vecvalues=np.concatenate((np.random.normal(0,1,30), np.random.normal(5,1,30)))[:,None]
Vecpoints=np.linspace(-8,12,100)[:,None]
kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(Vecvalues)
logkde = kde.score_samples(Vecpoints)
plt.plot(Vecpoints,np.exp(logkde))
plt.show()
The plot this produces looks like:
That form of the ksdensity
call automatically generates an arbitrary x
. scipy.stats.gaussian_kde()
returns a callable function that can be evaluated with any x
of your choosing. The equivalent x
would be np.linspace(data.min(), data.max(), 100)
.
import numpy as np
from scipy import stats
data = ...
kde = stats.gaussian_kde(data)
x = np.linspace(data.min(), data.max(), 100)
p = kde(x)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With