fit method in python sklearn

Tags:

python

model

scikit-learn

I am asking myself various questions about the fit method in sklearn.

Question 1: when I do:

from sklearn.decomposition import TruncatedSVD
model = TruncatedSVD()
svd_1 = model.fit(X1)
svd_2 = model.fit(X2)

Is the content of the variable model changing whatsoever during the process?

Question 2: when I do:

from sklearn.decomposition import TruncatedSVD
model = TruncatedSVD()
svd_1 = model.fit(X1)
svd_2 = svd_1.fit(X2)

What is happening to svd_1? In other words, svd_1 has already been fitted and I fit it again, so what happens to its components?

717

asked Jan 11 '16 17:01

sweeeeeet

2 Answers

Question 1: Is the content of the variable model changing whatsoever during the process?

Yes. The fit method modifies the object in place and returns a reference to that same object. So take care: in the first example, all three variables model, svd_1, and svd_2 actually refer to the same object.

from sklearn.decomposition import TruncatedSVD
model = TruncatedSVD()
svd_1 = model.fit(X1)
svd_2 = model.fit(X2)
print(model is svd_1 is svd_2)  # prints True

Question 2: What is happening to svd_1?

model and svd_1 refer to the same object, so there is absolutely no difference between the first and the second example.
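To see this concretely, here is a minimal sketch. The question does not show X1 and X2, so the random matrices below are made-up stand-ins, and n_components and random_state are chosen only for illustration. It takes a snapshot of components_ after the first fit and compares it with what the object holds after the second fit.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Assumption: toy data standing in for the X1 and X2 of the question.
rng = np.random.RandomState(0)
X1 = rng.rand(20, 5)
X2 = rng.rand(20, 5)

model = TruncatedSVD(n_components=2, random_state=0)

svd_1 = model.fit(X1)
# Copy the fitted components, because the next fit replaces them.
components_after_X1 = svd_1.components_.copy()

# Refitting through svd_1 refits the very same object as model.
svd_2 = svd_1.fit(X2)

same_object = model is svd_1 is svd_2
components_changed = not np.allclose(components_after_X1, model.components_)
print(same_object, components_changed)
```

Without the explicit .copy(), there would be no record of the first fit left once fit(X2) has run.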

Final Remark: What happens in both examples is that the result of fit(X1) is overwritten by fit(X2), as pointed out in the answer by David Maust. If you want to have two different models fitted to two different sets of data you need to do something like this:

svd_1 = TruncatedSVD().fit(X1)
svd_2 = TruncatedSVD().fit(X2)
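If the estimator has non-default settings that both models should share, one possible alternative is sklearn.base.clone, which builds a new unfitted estimator with the same parameters. This is only a sketch: the data and the parameter values are assumptions for illustration, and the name template is hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.decomposition import TruncatedSVD

# Assumption: toy data standing in for the X1 and X2 of the question.
rng = np.random.RandomState(0)
X1 = rng.rand(20, 5)
X2 = rng.rand(20, 5)

# "template" is a hypothetical name for a configured, never-fitted estimator.
template = TruncatedSVD(n_components=3, random_state=0)

# Each clone is a separate unfitted object with the same parameters.
svd_1 = clone(template).fit(X1)
svd_2 = clone(template).fit(X2)

independent = svd_1 is not svd_2
same_settings = svd_1.get_params() == svd_2.get_params()
template_untouched = not hasattr(template, "components_")
print(independent, same_settings, template_untouched)
```

Fitting svd_2 here leaves svd_1 alone, since the two names no longer point to one object.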

114

answered Oct 06 '22 23:10

MB-F

When you call fit on TruncatedSVD, it replaces the components with those built from the new matrix. Some estimators and transformers in scikit-learn, such as IncrementalPCA, have a partial_fit method, which builds a model incrementally as additional data is passed in.
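As a rough illustration of the difference, the sketch below calls partial_fit twice on an IncrementalPCA and reads its n_samples_seen_ attribute after each call. The two random batches are assumptions made up for the example.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Assumption: two toy batches standing in for data that arrives in pieces.
rng = np.random.RandomState(0)
X1 = rng.rand(50, 5)
X2 = rng.rand(50, 5)

ipca = IncrementalPCA(n_components=2)

# partial_fit updates the existing model instead of starting over.
ipca.partial_fit(X1)
seen_after_first = int(ipca.n_samples_seen_)

ipca.partial_fit(X2)
seen_after_second = int(ipca.n_samples_seen_)

print(seen_after_first, seen_after_second)
```

The sample count grows across calls, which shows that the second batch was added to the model rather than replacing the first.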

answered Oct 06 '22 22:10

David Maust

