 

ValueError: Unknown label type in scikit-learn

I am trying to generate meta-features, so I followed a tutorial and wrote the following:

from sklearn import tree

clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

But it raises a ValueError:

File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 739, in fit
X_idx_sorted=X_idx_sorted)
File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 146, in fit
check_classification_targets(y)
File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/multiclass.py", line 172, in check_classification_targets
raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'unknown'

Why is this error raised?

The dataset consists of floats and integers; the class labels are integers. describe() returns this:

         x1       x2       x3       x4      x5       x6       x7       x8  
count   3500.00  3500.00  3500.00  3500.00  3500.0  3500.00  3500.00  3500.00   
unique   501.00   516.00   572.00   650.00   724.0   779.00   828.00   757.00   
top        0.12     0.79     0.82     0.83     1.9     1.68     1.67     2.03   
freq      23.00    25.00    22.00    18.00    16.0    15.00    13.00    14.00   

         x9      x10  ...        x32      x33      x34      x35      x36  
count   3500.00  3500.00  ...    3500.00  3500.00  3500.00  3500.00  3500.00   
unique   730.00   676.00  ...     496.00   504.00   503.00   505.00   486.00   
top        3.27     3.47  ...       0.01     0.58    -0.27    -0.02     0.26   
freq      15.00    16.00  ...      23.00    24.00    26.00    23.00    24.00   

        x37      x38      x39      x40  class  
count   3500.00  3500.00  3500.00  3500.00   3500  
unique   488.00   490.00   492.00   506.00      3  
top       -0.03    -0.07     0.05    -0.19      1  
freq      23.00    25.00    22.00    24.00   1185  

The dataset looks like this:

       x33   x34   x35   x36   x37   x38   x39   x40 class  
0     -0.7  0.51  0.34 -0.13 -0.87  0.56 -0.53  0.29     2  
1     1.12   0.6  0.28  2.17  0.18 -0.09 -1.33     1     1  
2     -0.3 -0.07 -0.99 -0.75  1.11  1.35 -1.63   0.1     0  
3    -0.29 -1.62  0.19 -1.04  0.43 -1.82 -1.14 -0.23     1  
4    -0.78 -0.12 -0.35  0.44  0.31 -0.45 -0.23  0.27     0  
5     0.28  0.61  -0.4 -1.96  1.26 -0.72  2.01  0.95     2  
6     0.07  1.91 -0.15 -0.27   1.9  1.14 -0.05  0.04     0  
7     1.52 -1.52 -0.16 -0.41 -0.48 -0.37   0.8   1.3     2  
8    -0.52 -1.41 -3.49  1.74 -0.37 -0.25 -0.63   0.2     2  
9     0.78  0.09  -0.7  1.12 -0.32 -0.43 -0.34 -1.04     2  
10    0.25  0.29 -0.73 -0.02  2.14  1.49  0.02 -2.16     2  
11   -1.72 -0.09  0.43 -0.33 -1.66 -0.73  1.45  2.11     2  
12   -0.01 -2.63 -1.91  0.59   0.8  0.35  1.58 -0.98     2  

Its shape is [3500 rows x 41 columns].

asked Apr 25 '17 at 10:04 by evaleria


People also ask

How do I fix ValueError unknown label type continuous?

One way to resolve this error is to convert the values of the response variable to categorical labels with LabelEncoder from sklearn.preprocessing, which maps each distinct value to an integer class index. This only makes sense when the target really is categorical; a genuinely continuous target calls for a regressor rather than a classifier.
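
As a minimal sketch of that encoding step, assume a hypothetical target `y_raw` whose float values stand for three discrete levels (the array and its values are invented for illustration, not taken from the question's data):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical target: floats that actually stand for three discrete levels.
y_raw = np.array([0.5, 1.5, 0.5, 2.5])

encoder = LabelEncoder()
y = encoder.fit_transform(y_raw)  # one integer index per distinct value

print(encoder.classes_)              # the distinct original values, sorted
print(y)                             # integer class indices
print(encoder.inverse_transform(y))  # maps the indices back to the originals
```

The fitted encoder keeps the mapping, so predictions made on the encoded labels can be translated back with `inverse_transform`.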

What is unknown label type continuous?

scikit-learn reports "Unknown label type: 'continuous'" when a classifier (or a classification-based feature selector) receives a float target with non-integer values, for example a target normalized to the range 0 to 1. If the target holds integer values, they are accepted as class labels and the program works.
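
The label check can be reproduced directly with `type_of_target` from `sklearn.utils.multiclass`, the module that appears in the traceback above. The arrays below are made-up examples, not the question's data:

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

int_labels = np.array([0, 1, 2, 1])            # integer class labels
float_labels = np.array([0.0, 1.0, 2.0, 1.0])  # floats holding whole numbers
normalized = np.array([0.1, 0.5, 0.9])         # floats with fractional parts

print(type_of_target(int_labels))    # multiclass
print(type_of_target(float_labels))  # multiclass
print(type_of_target(normalized))    # continuous
```

Only the last array would be rejected by a classifier, since its values cannot be read as discrete classes.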


1 Answer

There are two likely problems, each with a fix:
1. Does your data have the appropriate dimensions? Check X.shape (an attribute, not a method call) to make sure your data is in the expected format. You can also check this question.
2. Try converting your data to a numeric dtype with np.asarray(..., dtype=np.float64). You can also check this question.
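
Both checks can be sketched as follows. The describe() output in the question lists unique/top/freq rather than mean/std, which pandas reports for non-numeric columns, so one plausible reading is that the values are stored with dtype object. The arrays `X_obj` and `y_obj` below are hypothetical stand-ins built under that assumption, not the asker's data, and the labels are cast to integers here rather than floats:

```python
import numpy as np
from sklearn import tree
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in data: numeric values stored with dtype=object,
# as can happen when columns are loaded as generic Python objects.
rng = np.random.RandomState(0)
X_obj = rng.randn(30, 4).round(2).astype(object)
y_obj = (np.arange(30) % 3).astype(object)

print(X_obj.shape, y_obj.shape)  # step 1: shape is an attribute, no parentheses
print(X_obj.dtype, y_obj.dtype)  # object, object
print(type_of_target(y_obj))     # 'unknown', the label type named in the error

# Step 2: cast to real numeric dtypes before fitting.
X = np.asarray(X_obj, dtype=np.float64)
y = np.asarray(y_obj, dtype=np.int64)
print(type_of_target(y))         # 'multiclass'

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(X, y)
```

With a pandas DataFrame the same idea would apply by casting the label column to an integer dtype before passing it to fit.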

answered Oct 27 '22 at 16:10 by Masoud