This is the head of a train data set.
Head of the X_Train
Running the below code:
logit = sm.GLM(Y_train, X_train, family=sm.families.Binomial())
result = logit.fit()
Can you please help?
Getting the below error : Error Screen Shot
This happens when all or nearly all of the values in one of the predictor categories (or a combination of predictors) are associated with only one of the binary outcome values.
A complete separation in a logistic regression, sometimes also referred as perfect prediction, happens when the outcome variable separates a predictor variable completely. Below is an example data set, where Y is the outcome variable, and X1 and X2 are predictor variables.
Python has detected a complete or quasi-complete separation in one or more of your predictors and the outcome variable.
This happens when all or nearly all of the values in one of the predictor categories (or a combination of predictors) are associated with only one of the binary outcome values. (I'm assuming you're attempting a logistic regression.) When this happens a solution cannot be found for the predictor coefficient.
There are several possible solutions. Depending on how many variables are in your analysis, you can try running two-way crosstabs on your outcome and each of the predictor variables to locate any cells with zero observations, and then drop that variable from the analysis or use fewer categories. Another option is to run a Firth logistic regression or a penalized regression.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With