Scipy: Speed up kernel density estimation's score_sample method?

Question

I'm trying to get the observed probability density using kernel density estimation. This is how I use the kde:

from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(sample)

The problem is that when I try to get the probability density of every point:

kde_result = kde.score_samples(sample)

it is very slow. How can I speed it up?

The sample consists of 300,000 (x, y) points.

Raphael · Accepted Answer

In case somebody is searching for an answer to this question: it is solved here. That answer describes how you can speed up execution by parallelizing the computation with multiprocessing.

This bit of code will do the job (also from the same answer):

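import multiprocessing

import numpy as np
from sklearn.neighbors import KernelDensity

# max(1, ...) keeps the pool size valid on single-core machines, where int(0.875 * 1) == 0.
def parallel_score_samples(kde, samples, thread_count=max(1, int(0.875 * multiprocessing.cpu_count()))):
    # Split the samples into one chunk per worker process, score the chunks
    # in parallel, and concatenate the results in the original order.
    with multiprocessing.Pool(thread_count) as p:
        return np.concatenate(p.map(kde.score_samples, np.array_split(samples, thread_count)))

if __name__ == "__main__":  # needed on platforms that spawn worker processes (e.g. Windows)
    # sample: the array of (x, y) points from the question.
    # Nonzero atol/rtol trade some accuracy for faster evaluation.
    kde = KernelDensity(bandwidth=2.0, atol=0.0005, rtol=0.01).fit(sample)
    kde_result = parallel_score_samples(kde, sample)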
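
If you want to convince yourself that splitting the samples does not change the result, here is a minimal sketch on made-up data. The random sample, the bandwidth of 0.5 and the chunk count of 4 are assumptions for illustration only, and the built-in map stands in for the pool's map so the check runs in a single process. Each point is scored independently of the other query points, so the chunked result should match a single call up to floating-point noise.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Synthetic stand-in for the real data: 2,000 random (x, y) points.
rng = np.random.default_rng(0)
sample = rng.normal(size=(2000, 2))

# Default atol/rtol (both 0), so no approximation is involved in this check.
kde = KernelDensity(bandwidth=0.5).fit(sample)

# Score everything in a single call.
log_density_full = kde.score_samples(sample)

# Score in 4 chunks and stitch the pieces back together in order.
# The built-in map stands in for multiprocessing.Pool.map here.
chunks = np.array_split(sample, 4)
log_density_chunked = np.concatenate(list(map(kde.score_samples, chunks)))

# Compare the two routes.
same = np.allclose(log_density_full, log_density_chunked)

# score_samples returns log-densities; exponentiate to get densities.
density = np.exp(log_density_chunked)
```

Note that score_samples returns the log of the density, so if you need the probability density itself, apply np.exp to the result as in the last line.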

Tags:

python

statistics

scipy

scikit-learn

cqcn1991
