
Cannot convert string to float in pandas (ValueError)

I have a dataframe created from a JSON output that looks like this:

        Total Revenue    Average Revenue    Purchase count    Rate
Date    
Monday  1,304.40 CA$     20.07 CA$          2,345             1.54 %

The values are received as strings from the JSON. I am trying to:

1) Remove all non-numeric characters from each entry (e.g. CA$ or %)
2) Convert the rate and revenue columns to float
3) Convert the count columns to int

I tried to do the following:

df[column] = (df[column].str.split()).apply(lambda x: float(x[0]))

It works fine except when a value contains a comma (e.g. 1,465 fails whereas 143 works).

I tried several functions to replace the "," with "", but nothing has worked so far. I always receive the following error:

ValueError: could not convert string to float: '1,304.40'

John_Mtl asked Aug 24 '16 14:08



2 Answers

These strings have commas as thousands separators so you will have to remove them before the call to float:

df[column] = (df[column].str.split()).apply(lambda x: float(x[0].replace(',', '')))

This can be simplified a bit by moving split inside the lambda:

df[column] = df[column].apply(lambda x: float(x.split()[0].replace(',', '')))
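To try this end to end, here is a minimal sketch that rebuilds the sample row from the question and applies the same split-then-replace logic to every column. The helper name `to_number` and the way the frame is constructed are assumptions made for illustration; the real data comes from the JSON output.

```python
import pandas as pd

# Sample data modelled on the question; every value arrives as a string.
df = pd.DataFrame(
    {
        "Total Revenue": ["1,304.40 CA$"],
        "Average Revenue": ["20.07 CA$"],
        "Purchase count": ["2,345"],
        "Rate": ["1.54 %"],
    },
    index=pd.Index(["Monday"], name="Date"),
)


def to_number(text):
    """Keep the first whitespace-separated token, drop commas, parse as float.

    Hypothetical helper: it assumes the number always comes first and that
    ',' is only ever a thousands separator.
    """
    return float(text.split()[0].replace(",", ""))


for column in df.columns:
    df[column] = df[column].apply(to_number)

# Counts are whole numbers, so cast that column to int afterwards.
df["Purchase count"] = df["Purchase count"].astype(int)
```

Doing the float conversion first and the int cast afterwards keeps a single parsing path for every column.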
DeepSpace answered Sep 28 '22 01:09


Another solution uses a list comprehension, which lets you apply the string methods that only work on a Series (a DataFrame column), such as str.split and str.replace:

df = pd.concat([df[col].str.split()
                       .str[0]
                       .str.replace(',','').astype(float) for col in df], axis=1)

# optionally convert the Purchase count column to int
df['Purchase count'] = df['Purchase count'].astype(int)
print(df)
         Total Revenue  Average Revenue  Purchase count  Rate
Date                                                        
Monday         1304.4            20.07            2345  1.54
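A variation on the same idea, offered as a sketch rather than as part of the answer above: instead of splitting on whitespace, strip every character that is not a digit, a dot, or a minus sign with a regular expression, then let `pd.to_numeric` do the parsing. The helper name `clean_numeric` and the sample frame are assumptions for illustration. This only works when `.` is the decimal separator and the unit text contains no dots or minus signs of its own.

```python
import pandas as pd


def clean_numeric(series):
    """Strip everything except digits, '.' and '-', then parse as numbers.

    Hypothetical helper: assumes '.' is the decimal separator and ',' is
    only a thousands separator.
    """
    stripped = series.str.replace(r"[^0-9.\-]", "", regex=True)
    return pd.to_numeric(stripped)


# Sample data modelled on the question; every value is a string.
raw = pd.DataFrame(
    {
        "Total Revenue": ["1,304.40 CA$"],
        "Average Revenue": ["20.07 CA$"],
        "Purchase count": ["2,345"],
        "Rate": ["1.54 %"],
    },
    index=pd.Index(["Monday"], name="Date"),
)

# DataFrame.apply passes each column to the helper as a Series.
cleaned = raw.apply(clean_numeric)
cleaned["Purchase count"] = cleaned["Purchase count"].astype(int)
```

Unlike `float`, `pd.to_numeric` also accepts `errors="coerce"`, which turns unparseable entries into NaN instead of raising, should the real data contain blanks.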
jezrael answered Sep 28 '22 02:09