Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert number strings with commas in pandas DataFrame to float

Tags:

python

pandas

I have a DataFrame that contains numbers as strings with commas for the thousands marker. I need to convert them to floats.

a = [['1,200', '4,200'], ['7,000', '-0.03'], [ '5', '0']] df=pandas.DataFrame(a) 

I am guessing I need to use locale.atof. Indeed

df[0].apply(locale.atof) 

works as expected. I get a Series of floats.

But when I apply it to the DataFrame, I get an error.

df.apply(locale.atof) 

TypeError: ("cannot convert the series to ", u'occurred at index 0')

and

df[0:1].apply(locale.atof) 

gives another error:

ValueError: ('invalid literal for float(): 1,200', u'occurred at index 0')

So, how do I convert this DataFrame of strings to a DataFrame of floats?

like image 390
pheon Avatar asked Mar 03 '14 02:03

pheon


People also ask

How do I remove commas from numbers in pandas?

In our first python program code, we use replace() method to eliminate all the commas (,) from a python string. The replace() command returns a replica of the string where a substring's existence is exchanged with another substring. Using replace() function, we swap the commas in the python string with null elements.

How do you break a comma separated string in a pandas column?

Since you have a list of comma separated strings, split the string on comma to get a list of elements, then call explode on that column.


1 Answers

If you're reading in from csv then you can use the thousands arg:

df.read_csv('foo.tsv', sep='\t', thousands=',') 

This method is likely to be more efficient than performing the operation as a separate step.


You need to set the locale first:

In [ 9]: import locale  In [10]: from locale import atof  In [11]: locale.setlocale(locale.LC_NUMERIC, '') Out[11]: 'en_GB.UTF-8'  In [12]: df.applymap(atof) Out[12]:       0        1 0  1200  4200.00 1  7000    -0.03 2     5     0.00 
like image 158
Andy Hayden Avatar answered Oct 15 '22 10:10

Andy Hayden