
How to update Logistic Regression Model?

I have trained a logistic regression model. Now I have to update (partial fit) the model with a new set of training data. Is it possible?

kathir raja asked Jan 28 '23 16:01



1 Answer

You cannot use partial_fit on LogisticRegression.

But you can:

  • use warm_start=True, which reuses the solution of the previous call to fit as initialization, to speed up convergence.
  • use SGDClassifier with loss='log' (named loss='log_loss' since scikit-learn 1.1), which fits the same logistic regression model by stochastic gradient descent and supports partial_fit.

Note the difference between partial_fit and warm_start. Both methods start from the previous model and update it, but partial_fit only updates the model slightly, while warm_start goes all the way to convergence on the new training data, forgetting the previous model. warm_start is only used to speed up convergence.
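A small sketch of that behaviour, on synthetic data with an arbitrary old/new split (both are assumptions for illustration): a warm-started refit on the new data should end up making essentially the same predictions as a model trained from scratch on the new data alone, because the old solution only serves as the starting point.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_old, y_old = X[:300], y[:300]
X_new, y_new = X[300:], y[300:]

# warm_start=True: the second fit starts from the coefficients of the first.
warm = LogisticRegression(warm_start=True, max_iter=1000)
warm.fit(X_old, y_old)
warm.fit(X_new, y_new)  # runs to convergence on X_new only

# Cold start on the new data alone, for comparison.
cold = LogisticRegression(max_iter=1000).fit(X_new, y_new)

# Fraction of samples on which the two models predict the same label.
agreement = (warm.predict(X) == cold.predict(X)).mean()
print(f"prediction agreement: {agreement:.3f}")
```

If the goal is a model that reflects both the old and the new data, warm_start alone does not achieve that; the refit would have to include the old samples as well.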

See also the glossary:

warm_start

When fitting an estimator repeatedly on the same dataset, but for multiple parameter values (such as to find the value maximizing performance as in grid search), it may be possible to reuse aspects of the model learnt from the previous parameter value, saving time. When warm_start is true, the existing fitted model attributes are used to initialise the new model in a subsequent call to fit.

Note that this is only applicable for some models and some parameters, and even some orders of parameter values. For example, warm_start may be used when building random forests to add more trees to the forest (increasing n_estimators) but not to reduce their number.
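The random forest case can be illustrated with a short sketch (synthetic data and the tree counts are arbitrary choices for the example): with warm_start=True, raising n_estimators and calling fit again keeps the existing trees and only trains the additional ones.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)

forest = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
forest.fit(X, y)
first_trees = list(forest.estimators_)  # keep references to the first 10 trees

# Ask for more trees; the second fit only grows the 15 new ones.
forest.set_params(n_estimators=25)
forest.fit(X, y)

print(len(forest.estimators_))
```

Setting n_estimators below the current number of trees and calling fit again is rejected with an error rather than pruning the forest.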

partial_fit also retains the model between calls, but differs: with warm_start the parameters change and the data is (more-or-less) constant across calls to fit; with partial_fit, the mini-batch of data changes and model parameters stay fixed.

There are cases where you want to use warm_start to fit on different, but closely related data. For example, one may initially fit to a subset of the data, then fine-tune the parameter search on the full dataset. For classification, all data in a sequence of warm_start calls to fit must include samples from each class.

__

partial_fit

Facilitates fitting an estimator in an online fashion. Unlike fit, repeatedly calling partial_fit does not clear the model, but updates it with respect to the data provided. The portion of data provided to partial_fit may be called a mini-batch. Each mini-batch must be of consistent shape, etc.

partial_fit may also be used for out-of-core learning, although usually limited to the case where learning can be performed online, i.e. the model is usable after each partial_fit and there is no separate processing needed to finalize the model. cluster.Birch introduces the convention that calling partial_fit(X) will produce a model that is not finalized, but the model can be finalized by calling partial_fit() i.e. without passing a further mini-batch.

Generally, estimator parameters should not be modified between calls to partial_fit, although partial_fit should validate them as well as the new mini-batch of data. In contrast, warm_start is used to repeatedly fit the same estimator with the same data but varying parameters.

TomDLT answered Feb 02 '23 08:02
