LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'

I'm getting this error for several variables, even after filling in the missing values. For example:

    le = preprocessing.LabelEncoder()
    categorical = list(df.select_dtypes(include=['object']).columns.values)
    for cat in categorical:
        print(cat)
        df[cat].fillna('UNK', inplace=True)
        df[cat] = le.fit_transform(df[cat])
    #     print(le.classes_)
    #     print(le.transform(le.classes_))

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-24-424a0952f9d0> in <module>()
          4     print(cat)
          5     df[cat].fillna('UNK', inplace=True)
    ----> 6     df[cat] = le.fit_transform(df[cat].fillna('UNK'))
          7 #     print(le.classes_)
          8 #     print(le.transform(le.classes_))

    C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py in fit_transform(self, y)
        129         y = column_or_1d(y, warn=True)
        130         _check_numpy_unicode_bug(y)
    --> 131         self.classes_, y = np.unique(y, return_inverse=True)
        132         return y
        133

    C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in unique(ar, return_index, return_inverse, return_counts)
        209
        210     if optional_indices:
    --> 211         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
        212         aux = ar[perm]
        213     else:

    TypeError: '>' not supported between instances of 'float' and 'str'

Checking the variable that led to the error gives:

    df['CRM do Médico'].isnull().sum()
    0

Besides NaN values, what could be causing this error?

289

asked Sep 25 '17 13:09

pceccon

1 Answer

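This happens because the series df[cat] contains elements of different types, e.g. both strings and floats. As the traceback shows, fit_transform calls np.unique, which sorts the values, and Python 3 cannot order a float against a str. The mix can come from the way the data was read (numbers parsed as floats, text as strings), or the column may have held floats and become mixed once fillna('UNK') inserted strings.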
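To confirm that a column really is mixed, it can help to count the Python types it contains. The following is a minimal sketch, not the asker's data: the series `s` and its values are made up to stand in for df[cat].

```python
import pandas as pd

# Hypothetical column mixing floats and strings, standing in for df[cat].
s = pd.Series([1234.0, "5678-SP", 42.0, "UNK"], dtype=object)

# Count how many elements of each Python type the column holds.
type_counts = s.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Sorting mixed floats and strings raises TypeError in Python 3,
# which is the same kind of comparison np.unique attempts internally.
try:
    sorted(s.tolist())
    sort_failed = False
except TypeError as exc:
    sort_failed = True
    print(exc)
```

If more than one type name shows up in the counts, the column needs to be made uniform before it is passed to the encoder.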

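In other words, the pandas dtype 'object' does not guarantee that every element is a str. An object column can hold any mix of Python objects, so it may contain floats alongside strings.

Casting every value to str before encoding, as in the following line:

    df[cat] = le.fit_transform(df[cat].astype(str))

should resolve the error.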
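As a rough end-to-end illustration of the cast, here is a small sketch on invented data. The frame, the column name "code" and its values are assumptions for the example only.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data; the column name and values are made up for illustration.
df = pd.DataFrame(
    {"code": pd.Series([1234.0, "A-17", None, "A-17"], dtype=object)}
)

le = LabelEncoder()

# Fill missing values, then cast so that every element is a str.
as_text = df["code"].fillna("UNK").astype(str)

# With uniform strings, the encoder can sort and label the values.
df["code_encoded"] = le.fit_transform(as_text)

print(list(le.classes_))
print(df["code_encoded"].tolist())
```

Note that the float 1234.0 becomes the string "1234.0" after the cast, so numeric-looking values are encoded by their text form. The sketch also assigns the result of fillna instead of calling it with inplace=True on a column selection, which is generally the more reliable pattern in recent pandas versions.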

196

answered Oct 07 '22 17:10

sgDysregulation

Tags: python, pandas, scikit-learn
