ValueError: Unknown label type in scikit-learn

I am trying to generate meta-features, so I followed a tutorial and wrote the following:

from sklearn import tree

clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

But it raises a ValueError:

  File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 739, in fit
    X_idx_sorted=X_idx_sorted)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 146, in fit
    check_classification_targets(y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/multiclass.py", line 172, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'unknown'

Why does it raise this error?

The dataset consists of floats and integers; the class labels are integers. describe() returns this:

         x1       x2       x3       x4      x5       x6       x7       x8  
count   3500.00  3500.00  3500.00  3500.00  3500.0  3500.00  3500.00  3500.00   
unique   501.00   516.00   572.00   650.00   724.0   779.00   828.00   757.00   
top        0.12     0.79     0.82     0.83     1.9     1.68     1.67     2.03   
freq      23.00    25.00    22.00    18.00    16.0    15.00    13.00    14.00   

         x9      x10  ...        x32      x33      x34      x35      x36  
count   3500.00  3500.00  ...    3500.00  3500.00  3500.00  3500.00  3500.00   
unique   730.00   676.00  ...     496.00   504.00   503.00   505.00   486.00   
top        3.27     3.47  ...       0.01     0.58    -0.27    -0.02     0.26   
freq      15.00    16.00  ...      23.00    24.00    26.00    23.00    24.00   

        x37      x38      x39      x40  class  
count   3500.00  3500.00  3500.00  3500.00   3500  
unique   488.00   490.00   492.00   506.00      3  
top       -0.03    -0.07     0.05    -0.19      1  
freq      23.00    25.00    22.00    24.00   1185

Dataset looks like this:

       x33   x34   x35   x36   x37   x38   x39   x40 class  
0     -0.7  0.51  0.34 -0.13 -0.87  0.56 -0.53  0.29     2  
1     1.12   0.6  0.28  2.17  0.18 -0.09 -1.33     1     1  
2     -0.3 -0.07 -0.99 -0.75  1.11  1.35 -1.63   0.1     0  
3    -0.29 -1.62  0.19 -1.04  0.43 -1.82 -1.14 -0.23     1  
4    -0.78 -0.12 -0.35  0.44  0.31 -0.45 -0.23  0.27     0  
5     0.28  0.61  -0.4 -1.96  1.26 -0.72  2.01  0.95     2  
6     0.07  1.91 -0.15 -0.27   1.9  1.14 -0.05  0.04     0  
7     1.52 -1.52 -0.16 -0.41 -0.48 -0.37   0.8   1.3     2  
8    -0.52 -1.41 -3.49  1.74 -0.37 -0.25 -0.63   0.2     2  
9     0.78  0.09  -0.7  1.12 -0.32 -0.43 -0.34 -1.04     2  
10    0.25  0.29 -0.73 -0.02  2.14  1.49  0.02 -2.16     2  
11   -1.72 -0.09  0.43 -0.33 -1.66 -0.73  1.45  2.11     2  
12   -0.01 -2.63 -1.91  0.59   0.8  0.35  1.58 -0.98     2

Its shape is [3500 rows x 41 columns].


asked Apr 25 '17 10:04

evaleria

1 Answer

There are two likely problems, each with a possible fix:
1. Does your data have the appropriate dimensions? Check X.shape (an attribute, not a method) to make sure your data is in the expected format; you can also check this question.
2. Try converting your data to float with np.asarray(..., dtype=np.float64); you can also check this question.
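
To make both checks concrete, here is a minimal sketch. One observation it relies on: the describe() output in the question shows count/unique/top/freq rows, which pandas reports for object-dtype columns rather than numeric ones, so the values are probably stored as Python objects. The small DataFrame below is a made-up stand-in for the asker's data (only the column names x1 and class come from the question), and casting the labels to np.int64 instead of float is my own choice for integer class labels, not something the question confirms.

```python
import numpy as np
import pandas as pd
from sklearn import tree
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for the asker's data: numbers stored with dtype=object.
df = pd.DataFrame(
    {"x1": [0.12, 0.79, 0.82, 0.83], "class": [2, 1, 0, 1]},
    dtype=object,
)

X = df.drop(columns="class")
y = df["class"]

# Check 1: dimensions. shape is an attribute, so no parentheses.
print(X.shape, y.shape)

# Check 2: dtypes. Object-dtype labels are what scikit-learn reports as 'unknown'.
print(y.dtype)
print(type_of_target(y))

# Convert features to float and labels to integers before fitting.
X_num = np.asarray(X, dtype=np.float64)
y_num = np.asarray(y, dtype=np.int64)
print(type_of_target(y_num))

clf = tree.DecisionTreeClassifier()
clf.fit(X_num, y_num)
```

If the conversion itself fails with a ValueError, some cells probably hold non-numeric strings, and those need cleaning first.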


answered Oct 27 '22 16:10

Masoud

Tags: python, machine-learning, scikit-learn
