 

Train multiple models in parallel with sklearn?

I want to train multiple LinearSVC models with different random states, but I would prefer to do it in parallel. Is there a mechanism supporting this in sklearn? I know GridSearchCV and some ensemble methods do this implicitly, but what is the thing under the hood?

asked Apr 12 '15 by erogol

People also ask

Is scikit-learn parallel?

Scikit-learn uses joblib for single-machine parallelism. This lets you train most estimators (anything that accepts an n_jobs parameter) using all the cores of your laptop or workstation.

Is sklearn multithreaded?

Scikit-learn relies heavily on NumPy and SciPy, which internally call multi-threaded linear algebra routines implemented in libraries such as MKL, OpenBLAS or BLIS.

What does n_jobs=-1 mean?

n_jobs=-1 means that the computation will be dispatched on all the CPUs of the computer.
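As a minimal sketch of the n_jobs parameter in action (the dataset and estimator choice here are illustrative, not from the original question):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data; in practice, use your own feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# n_jobs=-1 asks scikit-learn to dispatch the work across all available
# CPU cores; n_jobs=1 keeps the computation in a single process.
clf = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))
```

The same n_jobs convention applies to other estimators and utilities that support it, such as GridSearchCV and cross_val_score.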

How do you combine two classification models?

The most common method to combine models is by averaging multiple models, where taking a weighted average improves the accuracy. Bagging, boosting, and concatenation are other methods used to combine deep learning models. Stacked ensemble learning uses different combining techniques to build a model.
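In scikit-learn terms, the voting approach described above can be sketched with VotingClassifier (the estimators and data below are assumptions for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hard voting: each fitted model casts one vote per sample,
# and the majority class wins.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```

Passing voting="soft" instead averages the predicted class probabilities, which is the weighted-averaging variant mentioned above.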


1 Answer

The "thing" under the hood is the library joblib, which powers, for example, the multiprocessing in GridSearchCV and some ensemble methods. Its Parallel helper class is a very handy Swiss Army knife for embarrassingly parallel for loops.

Here is an example that trains multiple LinearSVC models with different random states in parallel across 4 processes using joblib:

from joblib import Parallel, delayed
from sklearn.svm import LinearSVC
import numpy as np

def train_model(X, y, seed):
    # Each call builds and fits an independent model,
    # so the calls can run in separate worker processes.
    model = LinearSVC(random_state=seed)
    return model.fit(X, y)

X = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([0, 1])

# delayed() captures the function and its arguments without calling it;
# Parallel dispatches the resulting tasks to 4 worker processes.
result = Parallel(n_jobs=4)(delayed(train_model)(X, y, seed) for seed in range(10))
# result is a list of 10 models trained with different seeds
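Once trained, the models can be combined, for instance by majority vote. This sketch repeats the training so it is self-contained; the aggregation step is an illustration, not part of the original answer:

```python
from joblib import Parallel, delayed
from sklearn.svm import LinearSVC
import numpy as np

def train_model(X, y, seed):
    return LinearSVC(random_state=seed).fit(X, y)

X = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([0, 1])
models = Parallel(n_jobs=4)(delayed(train_model)(X, y, seed) for seed in range(10))

# Stack each model's predictions: one row per model, one column per sample.
votes = np.array([m.predict(X) for m in models])

# Majority vote across the ten models for each sample.
majority = (votes.mean(axis=0) >= 0.5).astype(int)
print(majority)
```

With a single machine, Parallel can also be told to prefer threads over processes via its backend options, which avoids pickling the data for each worker.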
answered Dec 28 '22 by YS-L