
difference between LinearRegression and svm.SVR(kernel="linear")

First, there are questions on this forum that look very similar to this one, but none of them matches, so please don't mark this as a duplicate.

I have come across two ways of doing linear regression with scikit-learn and I can't work out the difference between them, especially since the first snippet calls train_test_split() while the second one calls fit directly.

I am studying with multiple resources and this single issue is very confusing to me.

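The first one, which uses SVR:

# Features: every column except 'label', standardized
X = np.array(df.drop(['label'], 1))
X = preprocessing.scale(X)

# Target
y = np.array(df['label'])

# Randomly hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2)

# Fit a support vector regressor with a linear kernel and score it (R^2) on the test set
clf = svm.SVR(kernel='linear')
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)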

And the second one:

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# Make predictions using the testing set
diabetes_y_pred = regr.predict(diabetes_X_test)

So my main question is: what is the difference between using svm.SVR(kernel="linear") and using LinearRegression()?

Dev_Man asked Oct 27 '17 08:10




1 Answer

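cross_validation.train_test_split: Splits arrays or matrices into random train and test subsets. (Since scikit-learn 0.18 this function lives in sklearn.model_selection; the cross_validation module was removed in 0.20.)

In the second snippet the split is not random: the last 20 samples are held out as the test set and everything before them is used for training. Either way, both snippets train on one part of the data and evaluate on the rest, so the split is not what distinguishes the two models.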
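To make the two splitting styles concrete, here is a minimal sketch on a toy array. The data and the variable names with a _pos suffix are made up for illustration, and train_test_split is imported from sklearn.model_selection, its location in current scikit-learn releases.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (illustrative only): 10 samples, 2 features
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Random split: 20% of the samples are drawn at random for testing.
# random_state is fixed only so the sketch is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Positional split: the last 2 samples are always the test set.
X_train_pos, X_test_pos = X[:-2], X[-2:]
y_train_pos, y_test_pos = y[:-2], y[-2:]

print("random test targets:    ", y_test)
print("positional test targets:", y_test_pos)
```

A positional split is only safe when the row order carries no structure; if the rows are sorted in some way, a random split is usually the better default.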

svm.SVR: Support Vector Regression (SVR) uses the same principles as the SVM for classification, with a few differences. Because the output is a real number, which has infinitely many possible values, it cannot be predicted exactly, so a margin of tolerance (epsilon) is set: errors smaller than epsilon are ignored. The algorithm is also somewhat more complicated than in the classification case. The main idea stays the same, though: minimize the error by finding the hyperplane that maximizes the margin, keeping in mind that part of the error is tolerated.
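The epsilon tolerance can be illustrated by comparing the per-sample losses. This is a rough sketch in plain Python, not scikit-learn's internal code: the function names are hypothetical, the squared loss is the one ordinary least squares minimizes, and epsilon=0.1 matches the documented default of svm.SVR.

```python
def squared_loss(residual):
    """Per-sample loss minimized by ordinary least squares."""
    return residual ** 2


def epsilon_insensitive_loss(residual, epsilon=0.1):
    """Per-sample loss used by SVR: errors inside the epsilon tube cost nothing."""
    return max(0.0, abs(residual) - epsilon)


for r in (0.05, 0.5, 2.0):
    print(r, squared_loss(r), epsilon_insensitive_loss(r))
```

Small residuals are free under the epsilon-insensitive loss, and large residuals are penalized linearly rather than quadratically, which is why the two models can settle on different lines for the same data.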

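Linear Regression: In statistics, linear regression is a linear approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variables) denoted X. The case of one explanatory variable is called simple linear regression. scikit-learn's LinearRegression fits this model by ordinary least squares, i.e. it minimizes the sum of squared residuals, with no epsilon tolerance and no regularization.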
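As a quick check that both estimators fit a straight line but by different criteria, the sketch below trains them on the same synthetic data. The data-generating line y = 3x + 2, the noise level, and all variable names are assumptions made for illustration; the hyperparameters are left at scikit-learn's defaults.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# Synthetic data (illustrative only): y = 3x + 2 plus Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=100)

# Ordinary least squares: minimizes the sum of squared residuals
ols = LinearRegression().fit(X, y)

# Linear-kernel SVR: minimizes an epsilon-insensitive loss plus an L2 penalty
svr = SVR(kernel="linear").fit(X, y)

ols_slope = ols.coef_[0]      # coef_ has shape (n_features,)
svr_slope = svr.coef_[0, 0]   # coef_ has shape (1, n_features) for a linear kernel

print("LinearRegression slope:", ols_slope, "intercept:", ols.intercept_)
print("SVR (linear) slope:    ", svr_slope, "intercept:", svr.intercept_[0])
```

On clean data like this the two lines should come out close to each other; they tend to diverge more when the data contain outliers or when C and epsilon are changed from their defaults.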

Reference: https://cs.adelaide.edu.au/~chhshen/teaching/ML_SVR.pdf

Tushar Gupta answered Sep 24 '22 01:09
