When training an SVM regression model, it is usually advisable to scale the input features before training.
But what about scaling the targets? Usually this is not considered necessary, and I do not see a good reason why it should be.
However, take the scikit-learn example for SVM regression from http://scikit-learn.org/stable/auto_examples/svm/plot_svm_regression.html:
Just by introducing the line y = y / 1000 before training, the prediction breaks down to a constant value. Scaling the target variable before training would solve the problem, but I do not understand why it is necessary.
What causes this problem?
import numpy as np
from sklearn.svm import SVR
import matplotlib.pyplot as plt
# Generate sample data
X = np.sort(5 * np.random.rand(40, 1), axis=0)
y = np.sin(X).ravel()
# Add noise to targets
y[::5] += 3 * (0.5 - np.random.rand(8))
# Added line: this will make the prediction break down
y = y / 1000
# Fit regression model
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_rbf = svr_rbf.fit(X, y).predict(X)
y_lin = svr_lin.fit(X, y).predict(X)
y_poly = svr_poly.fit(X, y).predict(X)
# look at the results
plt.scatter(X, y, c='k', label='data')
plt.plot(X, y_rbf, c='g', label='RBF model')
plt.plot(X, y_lin, c='r', label='Linear model')
plt.plot(X, y_poly, c='b', label='Polynomial model')
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.show()
Feature scaling is crucial for machine learning algorithms that rely on distances between observations, because those distances change when the features are rescaled. For an SVM, the decision boundary is chosen to maximize the margin to the nearest data points of the different classes, so it depends directly on those distances.
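As a small sketch (with made-up values), here is how standardizing the features can change which observation is "nearest" to another:

import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy points: feature 2 lives on a much larger scale than feature 1,
# so it dominates the raw Euclidean distances.
X = np.array([[1.0, 1000.0],
              [1.5, 1005.0],
              [9.0, 1001.0]])

raw = np.linalg.norm(X - X[0], axis=1)             # distances from point 0, unscaled
X_std = StandardScaler().fit_transform(X)
scaled = np.linalg.norm(X_std - X_std[0], axis=1)  # distances after scaling

print(raw)     # point 1 is nearest to point 0 on the raw data
print(scaled)  # after scaling, point 2 becomes the nearest neighbour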
You should use a scaler object for this (StandardScaler in recent versions of scikit-learn), not the freestanding function scale, because a scaler can be plugged into a Pipeline together with the SVM.
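A minimal sketch of such a pipeline; the SVC parameters here are just placeholders:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Scale the features, then fit the SVM on the scaled data.
scaling_svm = Pipeline([
    ("scaler", StandardScaler()),
    ("svm", SVC(C=1000)),
])

# scaling_svm.fit(X_train, y_train) fits the scaler and the SVM in sequence,
# and scaling_svm.predict(X_test) reuses the same fitted scaler.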
Maximizing that margin is exactly what an SVM does: it tries to find a line/hyperplane (in multidimensional space) that separates the two classes, and then classifies a new point according to which side of the hyperplane it falls on.
A Support Vector Machine can also be used as a regression method, keeping the main features that characterize the algorithm (the maximal margin). Support Vector Regression (SVR) uses the same principles as the SVM for classification, with only a few minor differences.
Support vector regression uses a loss function that is nonzero only when the difference between the predicted value and the target exceeds some threshold. Below the threshold, the prediction is considered "good enough" and the loss is zero. When you divide the targets by 1000, every error falls below that threshold, so the SVM learner can get away with returning a flat model: it no longer incurs any loss.
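Concretely, the per-sample loss is max(0, |y - f(x)| - epsilon), with epsilon = 0.1 by default in scikit-learn. A small sketch (with made-up targets) of why dividing y by 1000 pushes every error inside the epsilon tube:

import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    # epsilon-insensitive loss: zero whenever the error is within +/- eps
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

y = np.sin(np.linspace(0, 5, 40))   # targets roughly in [-1, 1]
flat = np.full_like(y, y.mean())    # a constant "flat" prediction

print(eps_insensitive_loss(y, flat).sum())                # > 0: the flat model is penalized
print(eps_insensitive_loss(y / 1000, flat / 1000).sum())  # 0.0: all errors are within eps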
The threshold parameter is called epsilon in sklearn.svm.SVR; set it to a lower value for smaller targets. The math behind this is explained here.
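So in the example from the question, shrinking epsilon by the same factor as the targets (the exact value below is just a suggestion) restores a sensible fit:

import numpy as np
from sklearn.svm import SVR

# Data similar to the question's (noise omitted for brevity), targets divided by 1000
X = np.sort(5 * np.random.rand(40, 1), axis=0)
y = np.sin(X).ravel() / 1000

# Shrink epsilon along with the targets
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1, epsilon=0.1 / 1000)
y_rbf = svr_rbf.fit(X, y).predict(X)

print(np.ptp(y_rbf))  # no longer (near-)constant: the prediction varies with X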