Enable Python to utilize all cores for fitting scikit-learn models

I'm running Python 2.7 with IPython on 64-bit Windows 8, on a system that has 4 cores. When fitting a scikit-learn model, total CPU usage is 50%: 25% from Python and 25% from Chrome.

Why is Chrome using as much CPU as Python?

Are there multithreaded versions of the scikit-learn model-fitting functions, so that using multiple cores is as easy as setting a parameter? For example:

grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1)

Nyxynyx asked May 01 '13 19:05


1 Answer

Very few sklearn models can run in parallel by themselves. GridSearchCV with n_jobs=-1 or n_jobs=4 should be able to use multiprocessing under Windows, provided it runs outside an interactive __main__ Python session (e.g. in a script) [1] and each underlying fit call lasts long enough (more than 1s, for instance) to outweigh the cost of starting the worker processes.
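As a rough illustration, a script along these lines should let GridSearchCV spread its fits over all cores. This is only a sketch under some assumptions: the synthetic dataset, the pipeline steps, the parameter grid and the helper name run_grid_search are all made up for the example, and the import path is the one used by recent scikit-learn releases (releases from around 2013 exposed GridSearchCV from sklearn.grid_search instead).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def run_grid_search(n_jobs=-1):
    """Fit a small grid search; n_jobs=-1 asks for all available cores."""
    # Synthetic data stands in for the asker's dataset (assumption).
    X, y = make_classification(n_samples=200, n_features=10, random_state=0)

    # Hypothetical pipeline and grid, chosen only to keep the example short.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    parameters = {"clf__C": [0.1, 1.0, 10.0]}

    grid_search = GridSearchCV(pipeline, parameters, n_jobs=n_jobs, cv=3)
    grid_search.fit(X, y)
    return grid_search


# On Windows, worker processes re-import the main module, so the code that
# starts the parallel work is kept behind this guard.
if __name__ == "__main__":
    search = run_grid_search()
    print(search.best_params_)
```

Saved to a file and run with the python command rather than pasted into an interactive session, this should show several Python processes busy at once, as long as each fit is slow enough to be worth distributing. With fits as quick as the ones in this toy example, the parallel run may not be any faster than n_jobs=1.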

The Chrome usage is probably unrelated: just close Chrome if you don't want it to use any CPU. You probably have a tab running some JavaScript or a buggy Flash application in the background.

[1] http://docs.python.org/2/library/multiprocessing.html#windows

ogrisel answered Sep 27 '22 16:09