
scikit-learn: Speed up kernel density estimation's score_samples method?

I'm trying to get the observed probability density using kernel density estimation. This is how I use the kde:

from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(sample)

The problem is that, when I try to get the probability density of every point

kde_result = kde.score_samples(sample)

The speed is very slow. How can I speed it up?

The sample consists of 300,000 (x, y) points.

cqcn1991 asked Nov 21 '22 05:11



1 Answer

In case somebody is searching for an answer to this question: it has already been solved in another answer. That answer describes how you can speed up execution by parallelizing the computation with multiprocessing, so that each worker process scores its own chunk of the samples.

This bit of code will do the job (also from the same answer):

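import multiprocessing

import numpy as np
from sklearn.neighbors import KernelDensity

# thread_count is the number of worker processes: about 7/8 of the cores, but at least 1.
def parallel_score_samples(kde, samples, thread_count=max(1, int(0.875 * multiprocessing.cpu_count()))):
    # Split the samples into one chunk per worker, score the chunks in parallel,
    # and concatenate the results in the original order.
    with multiprocessing.Pool(thread_count) as p:
        return np.concatenate(p.map(kde.score_samples, np.array_split(samples, thread_count)))

# The guard is needed on platforms that spawn worker processes (e.g. Windows).
if __name__ == "__main__":
    # sample is the array of (x, y) points from the question.
    kde = KernelDensity(bandwidth=2.0, atol=0.0005, rtol=0.01).fit(sample)
    kde_result = parallel_score_samples(kde, sample)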
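
To check that splitting the work does not change the result, here is a minimal self-contained sketch. The synthetic Gaussian data, the bandwidth, and the helper names `score_samples_in_chunks` and `make_demo_data` are assumptions made for illustration and are not from the original answer. It leaves the tolerances at their defaults so that the serial and chunked results can be compared directly.

```python
import multiprocessing

import numpy as np
from sklearn.neighbors import KernelDensity


def score_samples_in_chunks(kde, samples, n_procs=2):
    """Score `samples` in `n_procs` chunks, one worker process per chunk."""
    chunks = np.array_split(samples, n_procs)
    with multiprocessing.Pool(n_procs) as pool:
        # pool.map preserves chunk order, so the scores line up with `samples`.
        return np.concatenate(pool.map(kde.score_samples, chunks))


def make_demo_data(n_points=2000, seed=0):
    """Hypothetical stand-in for the question's (x, y) sample."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(n_points, 2))


if __name__ == "__main__":
    sample = make_demo_data()
    kde = KernelDensity(bandwidth=0.5).fit(sample)

    serial = kde.score_samples(sample)
    chunked = score_samples_in_chunks(kde, sample, n_procs=2)

    # Each point is scored against the same fitted model either way,
    # so the two arrays should agree up to floating-point error.
    print("max abs difference:", np.max(np.abs(serial - chunked)))

    # score_samples returns log-densities; exponentiate to get densities.
    density = np.exp(chunked)
    print("first densities:", density[:3])
```

Note that `score_samples` returns the log of the density, so `np.exp` is needed to get the probability density the question asks about. The `atol` and `rtol` arguments used in the answer are a separate lever: scikit-learn documents that larger tolerances generally lead to faster execution, at the price of approximate values.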
Raphael answered Mar 07 '23 09:03
