I have a test dataset and a training dataset, as below. I have provided a small sample, but my actual data has thousands of records. Here E is the target variable that I need to predict using an algorithm. It has only four categories (1, 2, 3, 4) and can take only one of these values.
Training Dataset:
A B C D E
1 20 30 1 1
2 22 12 33 2
3 45 65 77 3
12 43 55 65 4
11 25 30 1 1
22 23 19 31 2
31 41 11 70 3
1 48 23 60 4
Test Dataset:
A B C D E
11 21 12 11
1 2 3 4
5 6 7 8
99 87 65 34
11 21 24 12
Since E has only 4 categories, I thought of predicting this using Multinomial Logistic Regression (1 vs Rest Logic). I am trying to implement it using python.
I know the logic that we need to set these targets in a variable and use an algorithm to predict any of these values:
output = [1,2,3,4]
But I am stuck on how to do this in Python (sklearn): how do I loop through these values, and what algorithm should I use to predict the output values? Any help would be greatly appreciated.
Logistic regression computes the probability of an event occurring. It is a generalized linear model for a categorical target: the log of the odds is modeled as a linear function of the features. In its basic form, it predicts the probability of a binary event using the logit function.
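To make the log-odds idea concrete, here is a minimal sketch using only the standard library. The function names `logit` and `sigmoid` are illustrative choices, not part of sklearn.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps a real-valued score back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8
z = logit(p)            # log(0.8 / 0.2), i.e. log(4)
recovered = sigmoid(z)  # maps the score back to the original probability
print(z, recovered)
```

A binary logistic regression fits a linear function of the features to `z` and applies `sigmoid` to turn it into a probability.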
max_iter: int, default=100. Maximum number of iterations taken for the solvers to converge.
Also, it gives a good insight on what the multinomial logistic regression is: a set of J−1 independent logistic regressions for the probability of Y=j versus the probability of the reference Y=J: $p_j(x) = e^{\beta_{0j} + \beta_{1j} X_1 + \cdots + \beta_{pj} X_p} \, p_J(x)$.
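As a short worked step, the relation above can be combined with the constraint that the J class probabilities sum to one, which gives explicit formulas for each class. The shorthand $\eta_j(x)$ for the linear predictor is introduced here only for readability.

```latex
\eta_j(x) = \beta_{0j} + \beta_{1j} X_1 + \cdots + \beta_{pj} X_p,
\qquad j = 1, \dots, J-1

% Sum of all class probabilities equals one:
1 = p_J(x) + \sum_{k=1}^{J-1} p_k(x)
  = p_J(x) \left( 1 + \sum_{k=1}^{J-1} e^{\eta_k(x)} \right)

% Hence, for the reference class and for each other class:
p_J(x) = \frac{1}{1 + \sum_{k=1}^{J-1} e^{\eta_k(x)}},
\qquad
p_j(x) = \frac{e^{\eta_j(x)}}{1 + \sum_{k=1}^{J-1} e^{\eta_k(x)}}
```

With the four categories in the question, J = 4, so three sets of coefficients are estimated relative to the reference class.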
You could try
LogisticRegression(multi_class='multinomial', solver='newton-cg').fit(X_train, y_train)
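For a fuller picture, here is a minimal end-to-end sketch using the sample rows from the question. It makes a few assumptions: the features are standardized first with `StandardScaler`, `max_iter` is raised to 1000 so the solver has room to converge, and `multi_class` is left unset, since recent scikit-learn versions fit a multinomial model by default with the `lbfgs` solver when the target has more than two classes. No loop over the target values is needed; the estimator handles all four classes in one fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Training rows from the question: columns A, B, C, D and the target E.
train = np.array([
    [1, 20, 30, 1, 1],
    [2, 22, 12, 33, 2],
    [3, 45, 65, 77, 3],
    [12, 43, 55, 65, 4],
    [11, 25, 30, 1, 1],
    [22, 23, 19, 31, 2],
    [31, 41, 11, 70, 3],
    [1, 48, 23, 60, 4],
])
X_train, y_train = train[:, :4], train[:, 4]

# Test rows from the question: columns A, B, C, D (E is unknown).
X_test = np.array([
    [11, 21, 12, 11],
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [99, 87, 65, 34],
    [11, 21, 24, 12],
])

# Scale the features, then fit one model over all four classes.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

predictions = model.predict(X_test)          # one label from {1, 2, 3, 4} per row
probabilities = model.predict_proba(X_test)  # one column per class, rows sum to 1

print(predictions)
print(model.classes_)
print(probabilities.round(3))
```

With only eight training rows the predictions themselves mean little; on the real data it would be worth holding out part of the training set to check accuracy before predicting E for the test set.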