I'm trying to understand the difference between RidgeClassifier and LogisticRegression in sklearn.linear_model. I couldn't find it in the documentation.
I think I understand quite well what LogisticRegression does: it computes the coefficients and intercept that minimise half the sum of squares of the coefficients plus C times the binary cross-entropy loss, where C is the regularisation parameter. I checked against a naive implementation from scratch, and the results coincide.
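For what it's worth, here is a minimal sketch of that kind of check, assuming a toy dataset from make_classification (the dataset and starting point are illustrative; the objective follows the formulation above, with the intercept left unpenalised as sklearn does):

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y_pm = 2 * y - 1  # recode labels from {0, 1} to {-1, +1}
C = 1.0           # regularisation parameter, as in LogisticRegression(C=...)

def objective(params):
    # half the sum of squares of the coefficients + C * cross-entropy loss;
    # the intercept b is deliberately not penalised
    w, b = params[:-1], params[-1]
    margins = y_pm * (X @ w + b)
    return 0.5 * w @ w + C * np.logaddexp(0, -margins).sum()

res = minimize(objective, np.zeros(X.shape[1] + 1))

clf = LogisticRegression(C=C).fit(X, y)
print(np.round(res.x[:-1], 3), np.round(clf.coef_.ravel(), 3))  # should roughly coincide
print(np.round(res.x[-1], 3), np.round(clf.intercept_, 3))
```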
The results of RidgeClassifier differ, and I couldn't figure out how the coefficients and intercept are computed there. I looked at the GitHub code, but I'm not experienced enough to untangle it.
The reason I'm asking is that I like the RidgeClassifier results -- it generalises a bit better on my problem. But before I use it, I would like to at least have an idea of where it comes from.
Thanks for any help.
From the scikit-learn documentation for RidgeClassifier: "Classifier using Ridge regression. This classifier first converts the target values into {-1, 1} and then treats the problem as a regression task (multi-output regression in the multiclass case)."
RidgeClassifier() works differently from LogisticRegression() with an l2 penalty: the loss function for RidgeClassifier() is not cross-entropy. RidgeClassifier() uses the Ridge() regression model in the following way to create a classifier. Let us consider binary classification for simplicity:
1. Convert the target variable into +1 or -1 based on the class it belongs to.
2. Build a Ridge() model (which is a regression model) to predict that target variable. The loss function is MSE + l2 penalty.
3. If the Ridge() regression's prediction value (computed by decision_function()) is greater than 0, predict the positive class, else the negative class (see the sketch after these steps).
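A minimal sketch of steps 1-3, with an illustrative dataset (alpha is Ridge's regularisation strength; the equivalence follows the documented behaviour quoted above):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge, RidgeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = RidgeClassifier(alpha=1.0).fit(X, y)

# Steps 1 + 2: plain Ridge regression on {-1, +1} targets
reg = Ridge(alpha=1.0).fit(X, 2 * y - 1)

# Step 3: threshold the regression output at 0
pred = np.where(reg.predict(X) > 0, 1, 0)

print(np.allclose(clf.coef_, reg.coef_))     # same coefficients
print(np.array_equal(clf.predict(X), pred))  # same class predictions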
For multi-class classification:
1. Use LabelBinarizer() to create a multi-output regression scenario, and then train independent Ridge() regression models, one for each class (One-vs-Rest modelling).
2. Get a prediction from each class's Ridge() regression model (a real number per class) and then use argmax to predict the class, as the sketch below shows.
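A sketch of the multi-class case (again illustrative: LabelBinarizer with neg_label=-1 produces the {-1, +1} coding, and a single multi-output Ridge() fit is equivalent to independent per-class fits):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Ridge, RidgeClassifier
from sklearn.preprocessing import LabelBinarizer

X, y = load_iris(return_X_y=True)

clf = RidgeClassifier(alpha=1.0).fit(X, y)

# One-vs-Rest by hand: {-1, +1} indicator targets, one regression per class
Y = LabelBinarizer(neg_label=-1, pos_label=1).fit_transform(y)
reg = Ridge(alpha=1.0).fit(X, Y)  # multi-output regression

# argmax over the per-class real-valued scores picks the predicted class
pred = reg.predict(X).argmax(axis=1)

print(np.array_equal(clf.predict(X), pred))  # should print True
```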