I'm fairly new to modeling techniques and I'm trying to compare SVR and linear regression. I used the linear function f(x) = 5x + 10 to generate the training and test data sets. Here is the code I've written so far:
import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.linear_model import LinearRegression

# First 30 rows are used for training
with open('test.csv', 'r') as f1:
    train_dataframe = pd.read_csv(f1)
X_train = train_dataframe.iloc[:30, 0]
y_train = train_dataframe.iloc[:30, 1]

# Remaining rows are used for testing
with open('test.csv', 'r') as f2:
    test_dataframe = pd.read_csv(f2)
X_test = test_dataframe.iloc[30:, 0]
y_test = test_dataframe.iloc[30:, 1]

svr = svm.SVR(kernel="rbf", gamma=0.1)
lin = LinearRegression()

# scikit-learn expects a 2D feature array, so convert each Series to one column
svr.fit(X_train.values.reshape(-1, 1), y_train)
lin.fit(X_train.values.reshape(-1, 1), y_train)
predSVR = svr.predict(X_test.values.reshape(-1, 1))
predLin = lin.predict(X_test.values.reshape(-1, 1))

plt.plot(X_test, y_test, label='true data')
plt.plot(X_test, predSVR, 'co', label='SVR')
plt.plot(X_test, predLin, 'mo', label='LinReg')
plt.legend()
plt.show()
As you can see in the picture, linear regression works fine, but SVR has poor prediction accuracy.
Please let me know if you have any suggestions for tackling this issue.
Thanks
The reason is that SVR with the rbf kernel does not apply feature scaling on its own. You need to scale the data (here both the feature and the target) before fitting the model.
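from sklearn.preprocessing import StandardScaler
# StandardScaler expects 2D input, so reshape the 1D data into a single column
sc_X = StandardScaler()
X = sc_X.fit_transform(X_train.values.reshape(-1, 1))
sc_y = StandardScaler()
y = sc_y.fit_transform(y_train.values.reshape(-1, 1)).ravel()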
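To see the effect end to end, here is a minimal sketch that fits the same RBF SVR with and without scaling and maps the scaled predictions back to the original units with `inverse_transform`. A few things in it are assumptions and not from the question: the data is a synthetic stand-in for `test.csv` (random x values for f(x) = 5x + 10, so the test points fall roughly inside the training range), the helper name `scaled_svr_predict` is made up for illustration, and mean absolute error is just one convenient way to compare the two fits.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def scaled_svr_predict(X_train, y_train, X_test, gamma=0.1):
    """Fit an RBF SVR on standardized data; return predictions in original units.

    Hypothetical helper for illustration only.
    """
    sc_X, sc_y = StandardScaler(), StandardScaler()
    # Fit the scalers on the training data only
    X_train_s = sc_X.fit_transform(X_train)
    y_train_s = sc_y.fit_transform(y_train.reshape(-1, 1)).ravel()

    model = SVR(kernel="rbf", gamma=gamma)
    model.fit(X_train_s, y_train_s)

    # Reuse the training scaler for the test features, then undo the target scaling
    pred_s = model.predict(sc_X.transform(X_test))
    return sc_y.inverse_transform(pred_s.reshape(-1, 1)).ravel()


# Synthetic stand-in for test.csv (assumption): f(x) = 5x + 10 on random x
rng = np.random.default_rng(0)
x = rng.uniform(0, 40, size=40)
y = 5 * x + 10
X_train, y_train = x[:30].reshape(-1, 1), y[:30]
X_test, y_test = x[30:].reshape(-1, 1), y[30:]

pred_scaled = scaled_svr_predict(X_train, y_train, X_test)
pred_raw = SVR(kernel="rbf", gamma=0.1).fit(X_train, y_train).predict(X_test)

mae_scaled = mean_absolute_error(y_test, pred_scaled)
mae_raw = mean_absolute_error(y_test, pred_raw)
print(f"MAE with scaling:    {mae_scaled:.2f}")
print(f"MAE without scaling: {mae_raw:.2f}")
```

Note that the scalers are fitted on the training rows only and then reused for the test rows, so no information from the test set leaks into the fit. If your test x values lie well outside the training range, an RBF kernel may still predict poorly there even after scaling, since it does not extrapolate a linear trend.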