How can I distribute processing of MiniBatch KMeans (scikit-learn)?

Question

In scikit-learn, KMeans has an n_jobs parameter, but MiniBatchKMeans lacks it. MiniBatchKMeans (MBK) is faster than KMeans, but on large sample sets we would like to distribute the processing across multiple processes, using multiprocessing or another parallel processing library.

Is MBK's partial_fit the answer?

Andreas Mueller · Accepted Answer

I don't think this is possible. You could implement something with OpenMP inside the minibatch processing, but I'm not aware of any parallel minibatch k-means procedures. Parallelizing stochastic gradient descent procedures is somewhat hairy.
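Regarding partial_fit from the question: here is a minimal sketch of how it is typically used, assuming synthetic blob data and an arbitrary split into 10 chunks (both made up for illustration). Each call updates the same centroids, so the chunks are consumed one after another. This helps with data that does not fit in memory, but it does not by itself spread the work over several processes.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# Synthetic data: three well-separated 2-D blobs (illustrative only).
blob_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(size=(1000, 2)) for c in blob_centers])
rng.shuffle(X)  # shuffle rows so every chunk sees all blobs

# n_init is set explicitly only to avoid version-dependent default warnings.
mbk = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0)

# Each partial_fit call refines the centroids left by the previous call,
# so this loop is inherently sequential.
for chunk in np.array_split(X, 10):
    mbk.partial_fit(chunk)

labels = mbk.predict(X)
print(mbk.cluster_centers_)
```

In a real setting the chunks would come from disk or a stream instead of np.array_split.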

Btw, the n_jobs parameter in KMeans only distributes the different random initializations afaik.
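The same kind of parallelism (independent random initializations, not a single parallel fit) can be sketched for MiniBatchKMeans with joblib. This is only an illustration under assumptions: the helper names fit_one and best_of_parallel_inits, the seeds, and the random data are hypothetical, and each worker still runs one ordinary, sequential MiniBatchKMeans fit on the full data.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.cluster import MiniBatchKMeans


def fit_one(X, n_clusters, seed):
    """Fit one MiniBatchKMeans model from a single seeded initialization."""
    model = MiniBatchKMeans(n_clusters=n_clusters, n_init=1, random_state=seed)
    return model.fit(X)


def best_of_parallel_inits(X, n_clusters, seeds, n_jobs=2):
    """Run one fit per seed in parallel and keep the lowest-inertia model."""
    models = Parallel(n_jobs=n_jobs)(
        delayed(fit_one)(X, n_clusters, seed) for seed in seeds
    )
    return min(models, key=lambda m: m.inertia_)


# Hypothetical data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))

best = best_of_parallel_inits(X, n_clusters=4, seeds=[0, 1, 2, 3])
print(best.inertia_)
```

This does not make any single run faster; it only uses the extra cores to try more starting points in the same wall-clock time.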

Tags: python, multiprocessing, machine-learning, scikit-learn

Phyo Arkar Lwin
