I have the following code to test some of the most popular ML algorithms of the scikit-learn Python library:
import numpy as np
from sklearn import metrics, svm
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

trainingData = np.array([[2.3, 4.3, 2.5],
                         [1.3, 5.2, 5.2],
                         [3.3, 2.9, 0.8],
                         [3.1, 4.3, 4.0]])
trainingScores = np.array([3.4, 7.5, 4.5, 1.6])
predictionData = np.array([[2.5, 2.4, 2.7],
                           [2.7, 3.2, 1.2]])

clf = LinearRegression()
clf.fit(trainingData, trainingScores)
print("LinearRegression")
print(clf.predict(predictionData))

clf = svm.SVR()
clf.fit(trainingData, trainingScores)
print("SVR")
print(clf.predict(predictionData))

clf = LogisticRegression()
clf.fit(trainingData, trainingScores)
print("LogisticRegression")
print(clf.predict(predictionData))

clf = DecisionTreeClassifier()
clf.fit(trainingData, trainingScores)
print("DecisionTreeClassifier")
print(clf.predict(predictionData))

clf = KNeighborsClassifier()
clf.fit(trainingData, trainingScores)
print("KNeighborsClassifier")
print(clf.predict(predictionData))

clf = LinearDiscriminantAnalysis()
clf.fit(trainingData, trainingScores)
print("LinearDiscriminantAnalysis")
print(clf.predict(predictionData))

clf = GaussianNB()
clf.fit(trainingData, trainingScores)
print("GaussianNB")
print(clf.predict(predictionData))

clf = SVC()
clf.fit(trainingData, trainingScores)
print("SVC")
print(clf.predict(predictionData))
The first two work OK, but I get the following error in the LogisticRegression call:
root@ubupc1:/home/ouhma# python stack.py
LinearRegression
[ 15.72023529   6.46666667]
SVR
[ 3.95570063  4.23426243]
Traceback (most recent call last):
  File "stack.py", line 28, in <module>
    clf.fit(trainingData, trainingScores)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py", line 1174, in fit
    check_classification_targets(y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/multiclass.py", line 172, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'
The input data is the same as in the previous calls, so what is going on here?
And by the way, why is there such a huge difference between the first predictions of the LinearRegression() and SVR() algorithms (15.72 vs 3.95)?
The way to resolve this error is to convert the continuous values of the response variable to categorical values, using the LabelEncoder() class from sklearn. Each of the original values is then encoded as an integer class label (here, 0 through 3).
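A minimal sketch of that fix, reusing the trainingData and trainingScores arrays from the question (the single prediction row is just an illustrative input):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression

trainingData = np.array([[2.3, 4.3, 2.5],
                         [1.3, 5.2, 5.2],
                         [3.3, 2.9, 0.8],
                         [3.1, 4.3, 4.0]])
trainingScores = np.array([3.4, 7.5, 4.5, 1.6])

# Encode each distinct continuous value as an integer class label (0..n-1,
# assigned in sorted order of the unique values)
lab_enc = LabelEncoder()
encoded = lab_enc.fit_transform(trainingScores)
print(encoded)  # [1 3 2 0]

# The classifier now accepts the target vector
clf = LogisticRegression()
clf.fit(trainingData, encoded)
print(clf.predict(np.array([[2.5, 2.4, 2.7]])))
```

Note that lab_enc.inverse_transform() can map the predicted integer labels back to the original values.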
In feature selection, if the target values are normalized (to decimals between zero and one), you get the error "Unknown label type: 'continuous'". But if the target values are whole numbers rather than decimals between zero and one, the program works.
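You can see how sklearn categorizes a target vector with type_of_target; a small sketch with made-up arrays:

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Non-integral floats are treated as a continuous (regression) target,
# which classifiers reject
print(type_of_target(np.array([0.2, 0.5, 0.9])))  # continuous

# Whole numbers are treated as discrete class labels
print(type_of_target(np.array([1, 5, 7])))  # multiclass
```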
You are passing floats to a classifier, which expects categorical values as the target vector. If you convert the targets to int they will be accepted as input (although it is questionable whether that's the right way to do it). It would be better to convert your training scores using scikit-learn's LabelEncoder. The same applies to your DecisionTree and KNeighbors classifiers.
from sklearn import preprocessing
from sklearn import utils

lab_enc = preprocessing.LabelEncoder()
encoded = lab_enc.fit_transform(trainingScores)
>>> array([1, 3, 2, 0], dtype=int64)

print(utils.multiclass.type_of_target(trainingScores))
>>> continuous

print(utils.multiclass.type_of_target(trainingScores.astype('int')))
>>> multiclass

print(utils.multiclass.type_of_target(encoded))
>>> multiclass
LogisticRegression is not for regression but for classification! The y variable must be a classification class (for example 0 or 1), not a continuous variable; that would be a regression problem.
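A short sketch of the distinction, using the question's feature matrix with two hypothetical target vectors: continuous targets go to a regressor, discrete class labels to a classifier.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[2.3, 4.3, 2.5],
              [1.3, 5.2, 5.2],
              [3.3, 2.9, 0.8],
              [3.1, 4.3, 4.0]])

# Continuous targets -> regression problem
y_continuous = np.array([3.4, 7.5, 4.5, 1.6])
reg = LinearRegression().fit(X, y_continuous)

# Discrete class labels (made up here) -> classification problem
y_classes = np.array([0, 1, 1, 0])
clf = LogisticRegression().fit(X, y_classes)

print(reg.predict(X[:1]))  # a real-valued estimate
print(clf.predict(X[:1]))  # one of the class labels, 0 or 1
```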