
LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'

I'm getting this error for multiple variables, even after treating missing values. For example:

le = preprocessing.LabelEncoder()
categorical = list(df.select_dtypes(include=['object']).columns.values)
for cat in categorical:
    print(cat)
    df[cat].fillna('UNK', inplace=True)
    df[cat] = le.fit_transform(df[cat])
#     print(le.classes_)
#     print(le.transform(le.classes_))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-424a0952f9d0> in <module>()
      4     print(cat)
      5     df[cat].fillna('UNK', inplace=True)
----> 6     df[cat] = le.fit_transform(df[cat].fillna('UNK'))
      7 #     print(le.classes_)
      8 #     print(le.transform(le.classes_))

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py in fit_transform(self, y)
    129         y = column_or_1d(y, warn=True)
    130         _check_numpy_unicode_bug(y)
--> 131         self.classes_, y = np.unique(y, return_inverse=True)
    132         return y
    133

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in unique(ar, return_index, return_inverse, return_counts)
    209
    210     if optional_indices:
--> 211         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
    212         aux = ar[perm]
    213     else:

TypeError: '>' not supported between instances of 'float' and 'str'

Checking the variable that led to the error gives:

df['CRM do Médico'].isnull().sum()
0

Besides NaN values, what could be causing this error?

asked Sep 25 '17 13:09 by pceccon



1 Answer

This happens because the series df[cat] contains elements of different data types, e.g. both strings and floats, which cannot be compared with each other when the values are sorted. That can come from the way the data was read (numbers parsed as floats and text as strings), or from a column that was float and became mixed after the fillna operation.
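One way to check for this is to count the Python types actually stored in the column. The sketch below uses a made-up series (the values and the name `example` are assumptions, not the asker's data) to show that a column can have zero nulls and still mix str and float:

```python
import pandas as pd

# Hypothetical column mixing text and numbers, as can happen after reading a file
s = pd.Series(["A123", 456.0, None, "B789"], name="example")
s = s.fillna("UNK")

# No missing values remain, yet the dtype is still object
print(s.isnull().sum())
print(s.dtype)

# Count the Python type of every element
type_counts = s.apply(type).value_counts()
print(type_counts)

# Show the rows whose value is not a string
non_str = s[~s.apply(lambda v: isinstance(v, str))]
print(non_str)
```

If `type_counts` lists more than one type, sorting the column (which LabelEncoder does internally via np.unique, as the traceback shows) is likely to fail in the same way.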

In other words, the pandas dtype 'object' means the column can hold mixed types; it does not guarantee that every element is a str. So using the following line:

df[cat] = le.fit_transform(df[cat].astype(str)) 


should help
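Put together with the loop from the question, this could look roughly like the sketch below. The DataFrame and its column names are invented for illustration, and keeping one encoder per column in a dict (`encoders`) is an assumption on my part rather than something the question requires; it just makes it possible to decode each column later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical frame; "code" mixes strings and floats like the failing column
df = pd.DataFrame({
    "code": ["A123", 456.0, None, "A123"],
    "age": [31, 45, 27, 31],
})

encoders = {}  # one fitted encoder per column, so labels can be decoded later
for cat in df.select_dtypes(include=["object"]).columns:
    # Fill missing values first, then force every element to str
    values = df[cat].fillna("UNK").astype(str)
    encoders[cat] = LabelEncoder()
    df[cat] = encoders[cat].fit_transform(values)

print(df)
print(encoders["code"].classes_)
```

Note that after astype(str) a number such as 456.0 is encoded as the text "456.0", so it becomes just another category label.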

answered Oct 07 '22 17:10 by sgDysregulation