
ValueError: Unknown label type: 'unknown'

Tags: python, pandas, numpy, scikit-learn, logistic-regression

I am trying to run the following code. By the way, I am new to both Python and sklearn.

    import pandas as pd
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # data import and preparation
    trainData = pd.read_csv('train.csv')
    train = trainData.values
    testData = pd.read_csv('test.csv')
    test = testData.values
    X = np.c_[train[:, 0], train[:, 2], train[:, 6:7], train[:, 9]]
    X = np.nan_to_num(X)
    y = train[:, 1]
    Xtest = np.c_[test[:, 0:1], test[:, 5:6], test[:, 8]]
    Xtest = np.nan_to_num(Xtest)

    # model
    lr = LogisticRegression()
    lr.fit(X, y)

where y is an np.ndarray of 0s and 1s.

I receive the following:

    File "C:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py", line 1174, in fit
        check_classification_targets(y)
    File "C:\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 172, in check_classification_targets
        raise ValueError("Unknown label type: %r" % y_type)
    ValueError: Unknown label type: 'unknown'

From the sklearn documentation: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit

y : array-like, shape (n_samples,) Target values (class labels in classification, real numbers in regression)

What is my error?

Update:

y is array([0.0, 1.0, 1.0, ..., 0.0, 1.0, 0.0], dtype=object) and its shape is (891,)

265

asked Jul 27 '17 09:07

Ivan Zhovannik

1 Answer

Your y has dtype object, so sklearn cannot infer what kind of labels it holds. Add the line y = y.astype('int') right after the line y = train[:, 1].
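A rough sketch of that conversion, using a small hand-made array in place of the real train.csv column. When a DataFrame has columns of mixed types, .values returns an object array, so a label column sliced out of it keeps dtype=object even though every value is a float. Note that astype('int') silently truncates fractional values, so this sketch first checks that every label is a whole number. The helper name to_int_labels is made up for illustration; it is not a NumPy or scikit-learn function.

```python
import numpy as np


def to_int_labels(y):
    """Cast a label array (e.g. dtype=object) to an integer array.

    Hypothetical helper for illustration only. It refuses NaN and
    fractional values instead of letting astype(int) truncate them.
    """
    as_float = np.asarray(y, dtype=float)  # object -> float64
    if np.isnan(as_float).any():
        raise ValueError("labels contain NaN")
    if not np.all(as_float == np.round(as_float)):
        raise ValueError("labels are not whole numbers")
    return as_float.astype(int)


# Same situation as in the question: float values stored in an object array.
y = np.array([0.0, 1.0, 1.0, 0.0], dtype=object)
print(y.dtype)             # the dtype sklearn could not classify

y_int = to_int_labels(y)
print(y_int, y_int.dtype)  # integer labels
```

The resulting array can then be passed as y to lr.fit in place of the object-dtype one. If you are sure the labels are already whole numbers, the one-liner y = y.astype('int') from above does the same job.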

141

answered Sep 24 '22 15:09

Miriam Farber
