I'm trying to get the observed probability density using kernel density estimation. This is how I use the KDE:
from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(sample)
The problem is that when I try to get the probability density of every point:
kde_result = kde.score_samples(sample)
the evaluation is very slow. How can I speed it up?
The sample consists of 300,000 (x, y) points.
Just in case somebody is searching for an answer to this question: it is solved here. As described there, you can easily speed up execution by parallelizing the computation with multiprocessing.
This bit of code will do the job (also from the same answer):
import numpy as np
import multiprocessing
from sklearn.neighbors import KernelDensity
def parallel_score_samples(kde, samples, thread_count=int(0.875 * multiprocessing.cpu_count())):
    # Split the samples into one chunk per worker, score each chunk in a
    # separate process, and stitch the log-density arrays back together
    with multiprocessing.Pool(thread_count) as p:
        return np.concatenate(p.map(kde.score_samples, np.array_split(samples, thread_count)))

kde = KernelDensity(bandwidth=2.0, atol=0.0005, rtol=0.01).fit(sample)
kde_result = parallel_score_samples(kde, sample)
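Note that the atol and rtol arguments in the snippet above contribute a large speedup on their own, independent of multiprocessing: scikit-learn's tree-based KDE can prune its traversal once the requested absolute/relative tolerance on the density is met. A minimal sketch of the effect on a hypothetical synthetic sample (the names rng, kde_exact, and kde_approx are illustrative, not from the original post):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Hypothetical small 2-D sample for illustration (the original uses 300,000 points)
rng = np.random.default_rng(0)
sample = rng.normal(size=(5000, 2))

# Exact evaluation: the defaults atol=0, rtol=0 force a full tree traversal
kde_exact = KernelDensity(bandwidth=2.0).fit(sample)

# Approximate evaluation: nonzero tolerances let the tree stop refining
# a node once the density estimate is accurate enough, which is much faster
kde_approx = KernelDensity(bandwidth=2.0, atol=0.0005, rtol=0.01).fit(sample)

exact = kde_exact.score_samples(sample)    # log-densities, shape (5000,)
approx = kde_approx.score_samples(sample)

# The approximate log-densities stay close to the exact ones,
# within the requested tolerances on the underlying density
max_log_diff = np.max(np.abs(exact - approx))
```

How much tolerance you can afford depends on your application; if you only need relative comparisons between points, a loose rtol is usually safe, while atol matters most in low-density tails.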