Here is a paste of the code: SVM sample code
I checked out a couple of the other answers to this problem...and it seems like this specific iteration of the problem is a bit different.
First off, my inputs are normalized, and I have five inputs per point. The values are all reasonably sized (mostly healthy 0.5s and 0.7s; few values near 0 or near 1).
I have about 70 x inputs corresponding to 70 y values. The y values are also normalized (they are the percentage changes of my function after each time-step).
I initialize my SVR (and SVC), train them, and then test them with 30 out-of-sample inputs...and get the exact same prediction for every input (and the inputs are changing by reasonable amounts--0.3, 0.6, 0.5, etc.). I would think that the classifier (at least) would have some differentiation...
Here is the code I've got:
from sklearn import svm
import pandas as pd

# train svr
my_svr = svm.SVR()
my_svr.fit(x_training, y_trainr)
# train svc
my_svc = svm.SVC()
my_svc.fit(x_training, y_trainc)
# predict regression
p_regression = my_svr.predict(x_test)
p_r_series = pd.Series(index=y_testing.index, data=p_regression)
# predict classification
p_classification = my_svc.predict(x_test)
p_c_series = pd.Series(index=y_testing_classification.index, data=p_classification)
And here are samples of my inputs:
x_training = [[ 1.52068627e-04 8.66880301e-01 5.08504362e-01 9.48082047e-01
7.01156322e-01],
[ 6.68130520e-01 9.07506250e-01 5.07182647e-01 8.11290634e-01
6.67756208e-01],
... x 70 ]
y_trainr = [-0.00723209 -0.01788079 0.00741741 -0.00200805 -0.00737761 0.00202704 ...]
y_trainc = [ 0. 0. 1. 0. 0. 1. 1. 0. ...]
And the x_test matrix (5x30) is similar to the x_training matrix in terms of the magnitudes and variance of the inputs...same for y_testr and y_testc.
Currently, the predictions for all of the tests are exactly the same (0.00596 for the regression, and 1 for the classification...)
How do I get the SVR and SVC functions to spit out relevant predictions? Or at least different predictions based on the inputs...
At the very least, the classifier should be able to make choices. I mean, even if I haven't provided enough dimensions for regression...
As discussed earlier, SVMs are used for both classification and regression problems. Scikit-learn's Support Vector Classification (SVC) method can be extended to solve regression problems as well; that extension is called Support Vector Regression (SVR).
SVC is a classifier; SVR is a regressor.
Anyone working in machine learning or data science is familiar with the term SVM (Support Vector Machine), but SVR is a bit different. As the name suggests, SVR is a regression algorithm, so it works with continuous values rather than the classification that SVC performs.
SVR has an additional tunable parameter ε (epsilon). The value of epsilon determines the width of the tube around the estimated function (hyperplane). Points that fall inside this tube are treated as correct predictions and are not penalized by the algorithm.
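As a rough illustration (the toy data and variable names below are my own, not from the original post), a default epsilon that is wider than the spread of the targets lets every training point sit inside the tube, which is exactly how you end up with a near-constant prediction; shrinking epsilon forces the model to track the variation:
import numpy as np
from sklearn import svm

# hypothetical toy data: targets roughly the same tiny scale as y_trainr in the question
rng = np.random.RandomState(0)
X = rng.rand(70, 5)
y = 0.01 * np.sin(10 * X[:, 0]) + 0.001 * rng.randn(70)

# the default epsilon (0.1) is larger than the spread of y, so the whole target
# range fits inside the tube and the fit can come out essentially flat
wide = svm.SVR().fit(X, y)

# a much smaller epsilon (plus a larger C) makes the model fit the variation in y
narrow = svm.SVR(C=1000, epsilon=0.0001).fit(X, y)

print(np.std(wide.predict(X)), np.std(narrow.predict(X)))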
Try increasing your C from the default. It seems you are underfitting.
my_svc = svm.SVC(probability=True, C=1000)
my_svc.fit(x_training, y_trainc)
p_classification = my_svc.predict(x_test)
p_classification then becomes:
array([ 1., 0., 1., 0., 1., 1., 1., 1., 1., 1., 0., 0., 0.,
1., 0., 0., 0., 0., 0., 1., 1., 0., 1., 1., 1., 1.,
1., 1., 1., 1.])
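As a side note (not part of the original answer), since probability=True was set above, you can also inspect the per-class probabilities to confirm the classifier really is differentiating between inputs:
# class membership probabilities, one row per test sample (requires probability=True)
p_proba = my_svc.predict_proba(x_test)
print(p_proba[:5])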
For the SVR case you will also want to reduce your epsilon.
my_svr = svm.SVR(C=1000, epsilon=0.0001)
my_svr.fit(x_training, y_trainr)
p_regression = my_svr.predict(x_test)
p_regression then becomes:
array([-0.00430622, 0.00022762, 0.00595002, -0.02037147, -0.0003767 ,
0.00212401, 0.00018503, -0.00245148, -0.00109994, -0.00728342,
-0.00603862, -0.00321413, -0.00922082, -0.00129351, 0.00086844,
0.00380351, -0.0209799 , 0.00495681, 0.0070937 , 0.00525708,
-0.00777854, 0.00346639, 0.0070703 , -0.00082952, 0.00246366,
0.03007465, 0.01172834, 0.0135077 , 0.00883518, 0.00399232])
You should tune your C parameter using cross-validation so that it performs best on whichever metric matters most to you. You may want to look at GridSearchCV to help you do this.
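For example, a minimal sketch of such a search might look like the following (the grid values here are illustrative guesses, not tuned for the poster's data):
from sklearn import svm
from sklearn.model_selection import GridSearchCV

# hypothetical search grid; widen or narrow the ranges for your own data
param_grid = {
    "C": [1, 10, 100, 1000],
    "epsilon": [0.0001, 0.001, 0.01, 0.1],
}

# score with negative mean squared error so that "higher is better" for the search
search = GridSearchCV(svm.SVR(), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
search.fit(x_training, y_trainr)

print(search.best_params_)
my_svr = search.best_estimator_
p_regression = my_svr.predict(x_test)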
I had the same issue, but a completely different cause, and therefore a completely different place to look for a solution.
If your prediction inputs are scaled incorrectly for any reason, you can experience the same symptoms found here. This could come from forgetting (or miscoding) the scaling of input values for a later prediction, or from the inputs being in the wrong order.
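One way to guard against that (a sketch only; the pipeline setup is my suggestion, not something from the original answer) is to bundle the scaler and the model so the exact same fitted scaling is applied at prediction time:
from sklearn import svm
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# the scaler is fit on the training data only; the pipeline then reuses
# those same statistics automatically whenever it predicts
model = make_pipeline(StandardScaler(), svm.SVR(C=1000, epsilon=0.0001))
model.fit(x_training, y_trainr)

p_regression = model.predict(x_test)  # x_test is scaled with the training statistics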