scikit-learn cross validation, negative values with mean squared error

Tags: python, scikit-learn, regression, cross-validation

When I use the following code with a data matrix X of size (952, 144) and an output vector y of size (952,), the mean_squared_error metric returns negative values, which is unexpected. Any idea why?

from sklearn.svm import SVR
from sklearn import cross_validation as CV

reg = SVR(C=1., epsilon=0.1, kernel='rbf')
scores = CV.cross_val_score(reg, X, y, cv=10, scoring='mean_squared_error')

All values in scores are then negative.

asked Jan 29 '14 22:01

ahmethungari

3 Answers

Trying to close this out, so am providing the answer that David and larsmans have eloquently described in the comments section:

Yes, this is supposed to happen. The actual MSE is simply the positive version of the number you're getting.

The unified scoring API always maximizes the score, so metrics that should be minimized, such as MSE, are negated to fit that convention. The returned score is therefore negative for a loss (lower is better) and left unchanged for a metric where higher is better.

This is also described in the related question "sklearn GridSearchCV with Pipeline".
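
To recover the usual MSE, the negated scores can simply be flipped back. Here is a minimal sketch, assuming a recent scikit-learn where cross_val_score lives in sklearn.model_selection and the scorer key is 'neg_mean_squared_error'. The synthetic X and y are placeholders, since the question's data is not shown:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Synthetic stand-in for the question's data (assumption: the real X and y are not shown)
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = X[:, 0] + 0.1 * rng.randn(200)

reg = SVR(C=1.0, epsilon=0.1, kernel="rbf")

# The scorer returns -MSE so that "higher is better" holds for every metric
neg_mse = cross_val_score(reg, X, y, cv=10, scoring="neg_mean_squared_error")

# Flip the sign to recover the usual non-negative MSE per fold
mse = -neg_mse
print("mean MSE over folds:", mse.mean())
```

The sign only matters for model selection: GridSearchCV and similar tools pick the largest score, which for a negated loss is the smallest error.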

answered Sep 28 '22 19:09

AN6U5

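In newer scikit-learn versions the scorer is named "neg_mean_squared_error" (the old "mean_squared_error" key was deprecated in 0.18 and has since been removed), so change the scoring method as shown below. Note that the values are still negative; the new name only makes the negation explicit:

from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation was removed in 0.20

reg = SVR(C=1., epsilon=0.1, kernel='rbf')
# Each score is -MSE for one fold; use -scores to get the positive MSE
scores = cross_val_score(reg, X, y, cv=10, scoring='neg_mean_squared_error')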

answered Sep 28 '22 19:09

Otacílio Maia

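To see which scoring keys are available, use:

# get_scorer_names is available in scikit-learn 1.0 and later;
# older releases expose sklearn.metrics.SCORERS.keys() instead
from sklearn.metrics import get_scorer_names
print(sorted(get_scorer_names()))

For regression you can use, for example, 'r2' or 'neg_mean_squared_error'. There are many other options, depending on your requirements.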
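
To check how a particular scoring key treats the sign, you can call the scorer directly and compare it with the plain metric. This is a rough sketch, not the only way to do it: the LinearRegression model and the synthetic data are assumptions chosen just to have something to score.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer, make_scorer, mean_squared_error

# Synthetic regression data (assumption, for illustration only)
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.randn(100)

model = LinearRegression().fit(X, y)

# Plain metric: non-negative, lower is better
plain_mse = mean_squared_error(y, model.predict(X))

# greater_is_better=False marks the metric as a loss, so the scorer negates it
custom_scorer = make_scorer(mean_squared_error, greater_is_better=False)
builtin_scorer = get_scorer("neg_mean_squared_error")

# Scorers are called as scorer(estimator, X, y)
custom_value = custom_scorer(model, X, y)
builtin_value = builtin_scorer(model, X, y)
print(plain_mse, custom_value, builtin_value)
```

Both scorer values should equal the plain MSE with the sign flipped, which is the same negation that cross_val_score applies.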

answered Sep 29 '22 19:09

MD Rijwan
