
Ordinal logistic regression: intercept_ returns [1] instead of [n]

I'm running an ordinal (i.e. multinomial) ridge regression using the mord library (scikit-learn compatible).

y is a single column containing integer values from 1 to 19.

X is made of 7 numerical variables, each binned into 4 buckets and then dummy-encoded, for a final total of 28 binary variables.
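For readers who want to reproduce an X of this shape, here is a minimal NumPy sketch. The synthetic data, the quartile-based bucket edges, and all variable names are assumptions made for illustration; the question does not say how the buckets were actually chosen.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_vars, n_bins = 200, 7, 4

# Synthetic stand-in for the 7 numerical variables (assumption)
raw = rng.normal(size=(n_samples, n_vars))

blocks = []
for j in range(n_vars):
    col = raw[:, j]
    # Three interior quartile edges split the column into 4 buckets (assumption)
    edges = np.quantile(col, [0.25, 0.5, 0.75])
    bucket = np.digitize(col, edges)  # bucket index in 0..3
    # One indicator column per bucket
    blocks.append(np.eye(n_bins, dtype=int)[bucket])

# 7 variables x 4 buckets = 28 binary columns
X = np.hstack(blocks)
print(X.shape)
```

Each row contains exactly one 1 per original variable, so every row sums to 7.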

import pandas as pd
import numpy as np    
from sklearn import metrics
from sklearn.model_selection import train_test_split
import mord

in_X, out_X, in_y, out_y = train_test_split(X, y,
                                            stratify=y,
                                            test_size=0.3,
                                            random_state=42)

mul_lr = mord.OrdinalRidge(alpha=1.0,
                           fit_intercept=True,
                           normalize=False,
                           copy_X=True,
                           max_iter=None,
                           tol=0.001,
                           solver='auto').fit(in_X, in_y)

mul_lr.coef_ returns a [28 x 1] array but mul_lr.intercept_ returns a single value (instead of 19).

Any idea what I am missing?

Adav asked Feb 28 '19 14:02




1 Answer

If you want the model to produce an output for each of the 19 categories, first convert the label y to a one-hot encoding before training the model.

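import numpy as np
from sklearn.preprocessing import OneHotEncoder

y -= 1  # shift labels from 1..19 to 0..18
# n_values was removed in scikit-learn 0.22; pass the categories explicitly
enc = OneHotEncoder(categories=[np.arange(19)])
# OneHotEncoder expects a 2D array of shape (n_samples, 1)
y = enc.fit_transform(np.asarray(y).reshape(-1, 1)).toarray()
"""
train a model
"""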

Now mul_lr.intercept_.shape should be (19,).
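The shape difference can be checked without mord. The sketch below is a stand-in, not mord's implementation: it assumes that OrdinalRidge behaves like an ordinary ridge regression whose predictions are rounded to the nearest label, and uses a hypothetical closed-form helper `ridge_fit` on synthetic data. A single integer target gives one intercept; a one-hot target gives one intercept per class.

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge with an unpenalized intercept (hypothetical helper)."""
    X_mean = X.mean(axis=0)
    Y_mean = Y.mean(axis=0)
    Xc = X - X_mean
    Yc = Y - Y_mean
    # Solve (Xc^T Xc + alpha * I) W = Xc^T Yc
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(A, Xc.T @ Yc)
    intercept = Y_mean - X_mean @ coef
    return coef, intercept

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 500, 28, 19
X = rng.integers(0, 2, size=(n_samples, n_features)).astype(float)  # synthetic binary features
y = rng.integers(0, n_classes, size=n_samples)  # synthetic labels, already shifted to 0..18

# Single integer target: one weight vector, one scalar intercept
coef_1d, intercept_1d = ridge_fit(X, y.astype(float))
print(coef_1d.shape, np.ndim(intercept_1d))

# One-hot target: one weight vector and one intercept per class
Y_onehot = np.eye(n_classes)[y]
coef_oh, intercept_oh = ridge_fit(X, Y_onehot)
print(coef_oh.shape, intercept_oh.shape)

# Recover a class label as the column with the largest score
pred = np.argmax(X @ coef_oh + intercept_oh, axis=1)
```

Note that the one-hot fit scores each class independently, so the ordering of the 19 labels is no longer used. If what you want is one cut point between each pair of adjacent categories (n-1 values), mord's threshold-based models such as LogisticAT appear to expose these as a `theta_` attribute; check the mord documentation before relying on that.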

keineahnung2345 answered Nov 09 '22 03:11