Python

Question

I have the following code using linear_model.Lasso:

X_train, X_test, y_train, y_test = cross_validation.train_test_split(X,y,test_size=0.2)
clf = linear_model.Lasso()
clf.fit(X_train,y_train)
accuracy = clf.score(X_test,y_test)
print(accuracy)

I want to perform k-fold cross-validation (10 folds, to be specific). What would be the right code to do that?

Espoir Murhabazi · Accepted Answer

Here is the code I use to perform cross-validation on a linear regression model and to get the per-fold details:

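import numpy as np
from sklearn.model_selection import cross_val_score

# cv=10 runs 10-fold cross-validation; each score is a negated MSE
scores = cross_val_score(clf, X_train, y_train, scoring="neg_mean_squared_error", cv=10)
rmse_scores = np.sqrt(-scores)  # flip the sign, then take the root to get RMSE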

As explained on page 108 of this book, this is why we use -scores:

Scikit-Learn cross-validation features expect a utility function (greater is better) rather than a cost function (lower is better), so the scoring function is actually the opposite of the MSE (i.e., a negative value), which is why the preceding code computes -scores before calculating the square root.
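To see the sign convention in practice, here is a minimal sketch that compares the output of cross_val_score with an MSE computed by hand on the same folds. The synthetic data from make_regression and the names neg_mse and manual_mse are assumptions for illustration only; they are not part of the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data (an assumption; substitute your own X and y).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Use the same 10 unshuffled folds for both computations.
kf = KFold(n_splits=10)
neg_mse = cross_val_score(Lasso(), X, y, scoring="neg_mean_squared_error", cv=kf)

# Recompute the plain (positive) MSE on each fold by hand.
manual_mse = []
for train_idx, test_idx in kf.split(X):
    model = Lasso().fit(X[train_idx], y[train_idx])
    manual_mse.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))
manual_mse = np.array(manual_mse)

# Every returned score is <= 0, so negate before taking the square root.
rmse_scores = np.sqrt(-neg_mse)
print("All scores non-positive:", bool(np.all(neg_mse <= 0)))
print("Matches -MSE per fold:", bool(np.allclose(neg_mse, -manual_mse)))
```

Taking np.sqrt(scores) without the minus sign would produce NaN values, because the scores are negative.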

To display the results, use this simple function:

def display_scores(scores):
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())
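If you would rather keep the same metric that clf.score returns in the question (R^2 for a regressor), one option is to leave the scoring argument out, since cross_val_score then falls back to the estimator's own score method. The following is a rough sketch under that assumption, again on synthetic data from make_regression, with display_scores repeated so the snippet runs on its own.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

def display_scores(scores):
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())

# Synthetic regression data (an assumption; substitute your own X and y).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

clf = Lasso()

# No scoring argument: each fold is scored with clf.score, i.e. R^2.
r2_scores = cross_val_score(clf, X, y, cv=10)

display_scores(r2_scores)
```

Here higher is already better, so no sign flip is needed. The mean over the 10 folds plays the role of the single accuracy value printed in the question.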

Python - k fold cross validation for linear_model.Lasso

Tags:

linear-regression

cross-validation

Ryo
