I would like to train multiple one-class SVMs in different threads. Does anybody know if scikit-learn's SVM releases the GIL? I did not find any answers online.
Thanks
No, scikit-learn does not play any tricks with the GIL. Instead, it uses joblib for all its parallelism, which spawns multiple processes to do its work. You can achieve what you want with a custom joblib Parallel construct.
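As a rough illustration of such a Parallel construct, here is a minimal sketch that fits one OneClassSVM per dataset in separate worker processes. The helper `fit_one_class_svm`, the synthetic data, and the `nu` and `gamma` values are assumptions made for the example, not something from the question.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.svm import OneClassSVM


def fit_one_class_svm(X, nu):
    # Hypothetical helper: fit a single one-class SVM and return it.
    return OneClassSVM(nu=nu, kernel="rbf", gamma="scale").fit(X)


# Synthetic stand-in data: three small 2-D clouds with different centers.
rng = np.random.RandomState(0)
datasets = [rng.randn(200, 2) + shift for shift in (0.0, 3.0, -3.0)]

# Each delayed call is dispatched to a joblib worker (processes by default).
models = Parallel(n_jobs=2)(
    delayed(fit_one_class_svm)(X, 0.1) for X in datasets
)
```

The fitted models come back in the same order as the input datasets, so `models[i]` corresponds to `datasets[i]`.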
If you intend to train multiple classifiers on the same dataset with different settings to find the optimal one, consider using the GridSearchCV class, which handles parallelism for you.
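For the grid search route, something like the following sketch may be a starting point. It assumes a recent scikit-learn where GridSearchCV lives in `sklearn.model_selection` (older releases had it in `sklearn.grid_search`), and it uses a supervised SVC on made-up data, since a one-class SVM would need a custom scoring function. The parameter grid is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy classification problem standing in for the real dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Illustrative grid: 3 values of C times 2 values of gamma = 6 candidates.
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": ["scale", 0.1]}

# n_jobs=2 asks joblib to evaluate candidates in two parallel workers.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3, n_jobs=2)
search.fit(X, y)

best_params = search.best_params_
```

After fitting, `search.best_estimator_` holds the classifier refit on the full data with the best-scoring settings.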
Some sklearn Cython classes do release the GIL internally in performance-critical sections. Examples are the decision trees (used in random forests, among others) as of 0.15 (to be released early 2014) and the libsvm wrappers.
This is not the general rule, though. If you identify performance-critical Cython code in sklearn that could be changed to release the GIL, please feel free to send a pull request.
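If the libsvm wrappers do release the GIL in your installed version, fitting in plain threads is one thing to try. Below is a minimal sketch using the standard library's ThreadPoolExecutor; the helper `fit_model`, the synthetic data, and the hyperparameters are assumptions for illustration, and whether the threads actually run concurrently depends on the scikit-learn version, so it is worth timing against a sequential loop.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in data: four small 2-D clouds with different centers.
rng = np.random.RandomState(0)
datasets = [rng.randn(300, 2) + shift for shift in (0.0, 2.0, 4.0, 6.0)]


def fit_model(X):
    # Hypothetical helper: each thread fits its own independent model.
    return OneClassSVM(nu=0.1, gamma="scale").fit(X)


# pool.map preserves input order, so thread_models[i] matches datasets[i].
with ThreadPoolExecutor(max_workers=4) as pool:
    thread_models = list(pool.map(fit_model, datasets))
```

Because every thread works on its own estimator and its own array, no locking is needed in this sketch.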