
Kernel in a logistic regression model LogisticRegression scikit-learn sklearn

How can I use a kernel in a logistic regression model using the sklearn library?

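from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

y_pred = logreg.predict(X_test)
print(y_pred)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

# `predict` is assumed to hold the new samples to classify
predicted = logreg.predict(predict)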
asked Nov 07 '18 22:11 by Rubiks


People also ask

How do you avoid overfitting in logistic regression Sklearn?

In order to avoid overfitting, it is necessary to use additional techniques (e.g. cross-validation, regularization, early stopping, pruning, or Bayesian priors).
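
As a minimal sketch of two of these techniques combined, the snippet below picks the regularization strength of a logistic regression by cross-validation. The synthetic dataset and the grid of C values are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data, used only for illustration
X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Smaller C means stronger L2 regularization; 5-fold cross-validation picks C
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)

best_C = search.best_params_["C"]
test_accuracy = search.score(X_test, y_test)
print("Selected C:", best_C)
print("Test accuracy:", test_accuracy)
```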

What is kernel logistic regression?

Kernel logistic regression is a technique that extends regular logistic regression to deal with data that is not linearly separable.

How do you do logistic regression in Sklearn?

Logistic regression, despite its name, is a classification algorithm rather than a regression algorithm. Given a set of independent variables, it predicts a discrete outcome (0 or 1, yes/no, true/false) by estimating its probability. It is also called the logit or MaxEnt classifier.

What is kernel in Sklearn?

The RBF kernel is a stationary kernel. It is also known as the “squared exponential” kernel. It is parameterized by a length scale parameter l > 0, which can either be a scalar (isotropic variant of the kernel) or a vector with the same number of dimensions as the inputs X (anisotropic variant of the kernel).
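
A short sketch of the two variants, using the RBF class from sklearn.gaussian_process.kernels. The two sample points and the length scale values are made up for illustration.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# Two illustrative points in a 2-dimensional input space
X = np.array([[0.0, 0.0], [1.0, 2.0]])

# Isotropic: one length scale shared by all input dimensions
iso = RBF(length_scale=1.0)
K_iso = iso(X)

# Anisotropic: one length scale per input dimension
aniso = RBF(length_scale=[1.0, 2.0])
K_aniso = aniso(X)

print(K_iso[0, 1])    # exp(-0.5 * (1**2 + 2**2))
print(K_aniso[0, 1])  # exp(-0.5 * ((1/1)**2 + (2/2)**2))
```

Each entry is exp(-0.5 * d**2), where d is the Euclidean distance between the two points after dividing every coordinate by its length scale.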


1 Answer

Very nice question, but scikit-learn currently supports neither kernel logistic regression nor the ANOVA kernel.

You can implement it yourself, though.

Example 1 for the ANOVA kernel:

import numpy as np
from sklearn.metrics.pairwise import check_pairwise_arrays
from scipy.linalg import cholesky
from sklearn.linear_model import LogisticRegression

def anova_kernel(X, Y=None, gamma=None, p=1):
    X, Y = check_pairwise_arrays(X, Y)
    if gamma is None:
        gamma = 1. / X.shape[1]

    diff = X[:, None, :] - Y[None, :, :]
    diff **= 2
    diff *= -gamma
    np.exp(diff, out=diff)
    K = diff.sum(axis=2)
    K **= p
    return K

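# Kernel matrix based on X matrix of all data points (training and test together)
K = anova_kernel(X)
# Lower factor, so that K = R @ R.T and row i of R is the feature vector of
# sample i (with the upper factor the features would be the columns instead)
R = cholesky(K, lower=True)

# Define the model
clf = LogisticRegression()

# Here, I assume that you have split the data; train and test are the index
# arrays of the training and test samples
clf.fit(R[train], y_train)
preds = clf.predict(R[test])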
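
For reference, here is a self-contained sketch of the same idea on synthetic data. The make_moons dataset, the index split, and the small diagonal jitter are assumptions added for illustration; the kernel formula is the one from Example 1, and the lower Cholesky factor is used so that each row is one sample's feature vector.

```python
import numpy as np
from scipy.linalg import cholesky
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def anova_kernel(X, Y, gamma, p=1):
    # Same formula as anova_kernel in Example 1, written without in-place updates
    diff = X[:, None, :] - Y[None, :, :]
    return np.exp(-gamma * diff ** 2).sum(axis=2) ** p


# Synthetic two-class data, used only for illustration
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
train, test = train_test_split(np.arange(len(X)), test_size=0.25, random_state=0)

K = anova_kernel(X, X, gamma=1.0 / X.shape[1])
# Small diagonal jitter (an assumption of this sketch) keeps K numerically
# positive definite so that the Cholesky factorization succeeds
K[np.diag_indices_from(K)] += 1e-6

# Lower factor: K = L @ L.T, so row i of L is the feature vector of sample i
L = cholesky(K, lower=True)
reconstruction_error = np.abs(L @ L.T - K).max()

clf = LogisticRegression(max_iter=1000)
clf.fit(L[train], y[train])
preds = clf.predict(L[test])
accuracy = (preds == y[test]).mean()
print("Max reconstruction error:", reconstruction_error)
print("Test accuracy:", accuracy)
```

Note that the factorization needs the kernel matrix of all samples, test points included, so new points cannot be scored later without refactorizing.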

Example 2 for Nyström:

from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# anova_kernel is the function defined in Example 1
K_train = anova_kernel(X_train)
clf = Pipeline([
    ('nys', Nystroem(kernel='precomputed', n_components=100)),
    ('lr', LogisticRegression())])
clf.fit(K_train, y_train)

K_test = anova_kernel(X_test, X_train)
preds = clf.predict(K_test)
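
A self-contained variant of this pipeline is sketched below. It swaps in Nystroem's built-in RBF kernel (not the ANOVA kernel), so the raw features go straight into the pipeline and no kernel matrix has to be precomputed. The dataset, gamma, and n_components are assumptions chosen for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic two-class data, used only for illustration
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Nystroem maps the inputs to an approximate RBF feature space;
# the logistic regression is then fitted on those features
clf = Pipeline([
    ('nys', Nystroem(kernel='rbf', gamma=1.0, n_components=50, random_state=0)),
    ('lr', LogisticRegression(max_iter=1000))])
clf.fit(X_train, y_train)

preds = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)
print("Test accuracy:", accuracy)
```

Unlike the Cholesky approach, the fitted pipeline can score unseen samples, because Nystroem only needs the kernel between new points and the stored basis points.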
answered Sep 23 '22 16:09 by seralouk