scikit learn clf.fit / score model accuracy

Tags:

python

machine-learning

classification

scikit-learn

I'm building a model clf, say:

clf = MultinomialNB()
clf.fit(x_train, y_train)

Then I want to see the model's accuracy on the training data using score:

clf.score(x_train, y_train)

The result was 0.92.

My goal is to evaluate the model against the test set, so I use:

clf.score(x_test, y_test)

This gave me 0.77, so I expected the code below to give the same result:

clf.fit(X_train, y_train).score(X_test, y_test)

This gave me 0.54 instead. Can someone help me understand why the two scores differ (0.77 vs. 0.54)?

asked Oct 16 '13 16:10

JPC

1 Answer

You should get the same result if x_train, y_train, x_test and y_test are the same in both cases. Here is an example using the iris dataset; as you can see, both methods give the same result.

>>> from sklearn.naive_bayes import MultinomialNB
>>> # train_test_split lived in sklearn.cross_validation before scikit-learn 0.18
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.datasets import load_iris
>>> # prepare dataset (the split is random, so the scores below vary between runs)
>>> iris = load_iris()
>>> X = iris.data[:, :2]
>>> y = iris.target
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
>>> # model
>>> clf1 = MultinomialNB()
>>> clf2 = MultinomialNB()
>>> print(id(clf1), id(clf2))  # two different instances
4337289232 4337289296
>>> clf1.fit(X_train, y_train)
>>> print(clf1.score(X_test, y_test))
0.633333333333
>>> print(clf2.fit(X_train, y_train).score(X_test, y_test))
0.633333333333
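
The "same data" condition above can be checked directly. Python names are case-sensitive, so the lowercase x_train / x_test and uppercase X_train / X_test in the question may refer to different arrays; this is only a guess, since the question does not show how they were created. The sketch below is a minimal illustration under that assumption: it fixes random_state (my choice, not part of the original example) so the split is reproducible, and uses a hypothetical helper, same_data, to compare arrays.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB


def same_data(a, b):
    """Hypothetical helper: True if two arrays have equal shape and contents."""
    return np.array_equal(a, b)


X, y = load_iris(return_X_y=True)

# A fixed random_state makes the split identical on every run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Style 1: fit first, then score in a separate statement.
clf1 = MultinomialNB()
clf1.fit(X_train, y_train)
score_separate = clf1.score(X_test, y_test)

# Style 2: chained call; fit() returns the fitted estimator itself.
score_chained = MultinomialNB().fit(X_train, y_train).score(X_test, y_test)

# Same data and same model class, so the two scores agree.
print(score_separate == score_chained)

# A second split with another seed stands in for differently named
# variables (e.g. lowercase x_test) that hold different rows.
_, x_test_other, _, _ = train_test_split(X, y, test_size=0.2, random_state=1)
print(same_data(X_test, X_test), same_data(X_test, x_test_other))
```

If the two scores differ in your own session, comparing the inputs with np.array_equal is a quick way to confirm whether both calls really saw the same arrays.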

answered Oct 22 '22 07:10

jabaldonedo
