I'm running an ordinal (i.e. multinomial) ridge regression using the mord (scikit-learn compatible) library.
y is a single column containing integer values from 1 to 19.
X is made of 7 numerical variables, each binned into 4 buckets and then dummy-encoded, giving a final set of 28 binary variables.
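(For context, the features were built roughly as follows; the column names and data below are illustrative placeholders, not my actual data:)
import pandas as pd
import numpy as np

# Illustrative only: 7 numeric columns, each binned into 4 quartile buckets,
# then dummy-encoded into 7 * 4 = 28 binary columns.
rng = np.random.default_rng(42)
raw = pd.DataFrame(rng.normal(size=(1000, 7)),
                   columns=[f"var{i}" for i in range(1, 8)])
binned = raw.apply(lambda col: pd.qcut(col, q=4, labels=False))
X = pd.get_dummies(binned.astype("category"))   # 28 binary columns
y = pd.Series(rng.integers(1, 20, size=1000))   # integer labels 1..19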
import pandas as pd
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
import mord

# stratified 70/30 train/test split
in_X, out_X, in_y, out_y = train_test_split(X, y,
                                            stratify=y,
                                            test_size=0.3,
                                            random_state=42)

# ordinal ridge regression
mul_lr = mord.OrdinalRidge(alpha=1.0,
                           fit_intercept=True,
                           normalize=False,
                           copy_X=True,
                           max_iter=None,
                           tol=0.001,
                           solver='auto').fit(in_X, in_y)
mul_lr.coef_ returns a [28 x 1] array, but mul_lr.intercept_ returns a single value (instead of 19).
Any idea what I am missing?
Assumptions: the dependent variable is measured on an ordinal level; the independent variables are continuous, categorical, or ordinal; and there is no multicollinearity, i.e. no two or more independent variables are highly correlated with each other.
In an ordinal regression, what you are trying to understand is how much each predictor pushes the outcome toward the next "jump up", i.e. an increase into the next category of the outcome.
Ordinal logistic regression is a statistical analysis method that can be used to model the relationship between an ordinal response variable and one or more explanatory variables. An ordinal variable is a categorical variable for which there is a clear ordering of the category levels.
Unlike simple linear regression, in ordinal logistic regression we obtain n-1 intercepts, where n is the number of categories in the dependent variable. Each intercept (cut point) can be interpreted as the log-odds of falling at or below the corresponding category when all predictors are zero.
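In mord specifically, OrdinalRidge is essentially scikit-learn's Ridge with rounded predictions, so it only ever exposes a single intercept. If what you are after is the n-1 cut points, a threshold-based model such as mord.LogisticAT exposes them, if I remember correctly, through its theta_ attribute. A rough sketch, reusing in_X and in_y from your code:
import mord

# All-thresholds ordinal logistic model; alpha is the regularisation strength.
at = mord.LogisticAT(alpha=1.0).fit(in_X, in_y)

print(at.coef_.shape)    # (28,) -> one weight per feature
print(at.theta_.shape)   # (18,) -> n - 1 = 18 cut points between the 19 categories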
If you would like your model to predict all 19 categories, you first need to convert your label y to a one-hot encoding before training the model.
import numpy as np
from sklearn.preprocessing import OneHotEncoder

y = y - 1                                          # range from 1 to 19 -> range from 0 to 18
enc = OneHotEncoder(categories=[np.arange(19)])    # n_values was removed from scikit-learn; use categories
y = enc.fit_transform(np.asarray(y).reshape(-1, 1)).toarray()  # OneHotEncoder expects a 2D array

# train a model
Now mul_lr.intercept_.shape should be (19,).
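As a quick sanity check after re-running the same train_test_split on the encoded labels and refitting, the shapes below are what I'd expect (not something I've verified against your data):
mul_lr = mord.OrdinalRidge(alpha=1.0).fit(in_X, in_y)  # in_y is now one-hot, shape (n_samples, 19)

print(mul_lr.coef_.shape)        # expected (19, 28): one row of weights per category column
print(mul_lr.intercept_.shape)   # expected (19,): one intercept per category column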