What does calling fit() multiple times on the same model do?

Tags:

python

machine-learning

scikit-learn

After I instantiate a scikit-learn model (e.g. LinearRegression), what happens if I call its fit() method multiple times (with different X and y data)? Does it fit the model on the new data as if I had just re-instantiated the model (i.e. from scratch), or does it take into account the data already fitted in the previous call to fit()?

Trying it with LinearRegression (and also looking at its source code), it seems to me that every time I call fit(), it fits from scratch, ignoring the result of any previous call to the same method. I wonder if this is true in general, and whether I can rely on this behavior for all models/pipelines in scikit-learn.

563

asked Apr 15 '18 11:04

Fanta

2 Answers

If you execute model.fit(X_train, y_train) a second time, it will overwrite all previously fitted coefficients, weights, intercept (bias), etc.
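A quick way to check this yourself is the sketch below. It is only an illustration on synthetic, noise-free data; the variable names (X1, y1, X2, y2, fresh) are made up for the example. It fits a LinearRegression twice on different data and compares the result with a brand-new estimator fitted only on the second data set.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Two unrelated synthetic data sets (assumed for illustration, noise-free).
rng = np.random.default_rng(0)
X1 = rng.normal(size=(50, 1))
y1 = 2.0 * X1[:, 0] + 1.0      # slope 2, intercept 1
X2 = rng.normal(size=(50, 1))
y2 = -3.0 * X2[:, 0] + 5.0     # slope -3, intercept 5

model = LinearRegression()
model.fit(X1, y1)
coef_after_first = model.coef_.copy()   # learned from the first data set

model.fit(X2, y2)                       # second call on the same object

# A brand-new estimator trained only on the second data set.
fresh = LinearRegression().fit(X2, y2)

# If fit() starts from scratch, the refitted model equals the fresh one.
refit_matches_fresh = (
    np.allclose(model.coef_, fresh.coef_)
    and np.allclose(model.intercept_, fresh.intercept_)
)
print(coef_after_first, model.coef_, refit_matches_fresh)
```

One caveat for the "in general" part of the question: some estimators accept warm_start=True, and with that option a repeated fit() starts from the previous solution instead of from scratch. The reset behavior shown here applies when that option is left at its default.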

If you want to fit only a portion of your data set first and then improve the model by fitting new data, you can use estimators that support "incremental learning" (those that implement the partial_fit() method).
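As a rough sketch of incremental learning, the example below uses SGDRegressor, one of the estimators that implements partial_fit(). The data, the batch split, and the true_coef vector are assumptions made up for the illustration. Each call to partial_fit() continues from the current coefficients instead of resetting them.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Synthetic, noise-free regression data (assumed for illustration).
rng = np.random.default_rng(0)
true_coef = np.array([1.5, -2.0])
X = rng.normal(size=(200, 2))
y = X @ true_coef + 0.5

model = SGDRegressor(random_state=0)

# First batch: the model starts from zero-initialized coefficients.
model.partial_fit(X[:100], y[:100])
coef_after_batch1 = model.coef_.copy()

# Second batch: training continues from the coefficients learned so far.
model.partial_fit(X[100:], y[100:])
coef_after_batch2 = model.coef_.copy()

# Distance to the coefficients that generated the data, after each batch.
err1 = np.linalg.norm(coef_after_batch1 - true_coef)
err2 = np.linalg.norm(coef_after_batch2 - true_coef)
print(err1, err2)
```

Calling model.fit(...) at any point would discard this accumulated state and start over, which is the difference between the two methods.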

165

answered Oct 08 '22 00:10

MaxU - stop WAR against UA

The terms "fit" and "train" are used interchangeably in machine learning. Depending on the classifier you have instantiated, for example clf = GaussianNB() or clf = SVC(), your model uses the corresponding machine learning technique.
As soon as you call clf.fit(features_train, label_train), the model starts training on the features and labels you passed.

You can then use clf.predict(features_test) to predict.
If you call clf.fit(features_train2, label_train2) again, it will start training again on the newly passed data and discard the previous results. The model will reset the following internal state:

Weights
Fitted Coefficients
Bias
Any other state learned during training

You can use the partial_fit() method instead if you want to keep what was previously learned and continue training on the next batch of data.
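To make the contrast concrete for a classifier, here is a small sketch with GaussianNB on made-up data. The helper make_batch and the synthetic batches are hypothetical, and the variable names just mirror the ones used above. The first half shows fit() discarding the earlier training; the second half shows partial_fit() accumulating both batches.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

def make_batch(loc0, loc1, n=30):
    # Hypothetical helper: two Gaussian blobs, one per class.
    X = np.vstack([rng.normal(loc0, 1.0, (n, 2)), rng.normal(loc1, 1.0, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return X, y

features_train, label_train = make_batch(0.0, 3.0)
features_train2, label_train2 = make_batch(1.0, 4.0)

# fit() twice: the second call discards what was learned from the first.
clf = GaussianNB()
clf.fit(features_train, label_train)
clf.fit(features_train2, label_train2)
fresh = GaussianNB().fit(features_train2, label_train2)
refit_equals_fresh = np.allclose(clf.theta_, fresh.theta_)

# partial_fit() twice: statistics from both batches are accumulated.
# The first call must be told the full set of classes.
inc = GaussianNB()
inc.partial_fit(features_train, label_train, classes=np.array([0, 1]))
inc.partial_fit(features_train2, label_train2)

# Reference: one fit() on both batches stacked together.
full = GaussianNB().fit(
    np.vstack([features_train, features_train2]),
    np.concatenate([label_train, label_train2]),
)
incremental_equals_full = np.allclose(inc.theta_, full.theta_)
print(refit_equals_fresh, incremental_equals_full)
```

Here theta_ holds the per-class feature means, so comparing it is a simple way to see which data each model has actually learned from.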

answered Oct 08 '22 02:10

sgrpwr
