
Converting currency with $ to numbers in Python pandas

I have the following data in a pandas DataFrame:

        state       1st          2nd           3rd
    0   California  $11,593,820  $109,264,246  $8,496,273
    1   New York    $10,861,680  $45,336,041   $6,317,300
    2   Florida     $7,942,848   $69,369,589   $4,697,244
    3   Texas       $7,536,817   $61,830,712   $5,736,941

I want to perform some simple analysis (e.g., sum, groupby) with three columns (1st, 2nd, 3rd), but the data type of those three columns is object (or string).

So I used the following code for data conversion:

    data = data.convert_objects(convert_numeric=True)

But the conversion does not work, perhaps because of the dollar signs. Any suggestions?

asked Sep 08 '15 17:09 by kevin




1 Answer

@EdChum's answer is clever and works well. But since there's more than one way to bake a cake, why not use a regex? For example:

    # Strip "$" and "," from every column except the first, then cast to float.
    df[df.columns[1:]] = df[df.columns[1:]].replace(r'[\$,]', '', regex=True).astype(float)

To me, that is a little bit more readable.
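For anyone who wants to try this end to end, here is a minimal sketch. It assumes the sample frame from the question, rebuilt by hand; the names `money_cols` and `totals` are only illustrative and not part of the original question or answer.

```python
import pandas as pd

# Sample data copied from the question (rebuilt by hand for this sketch).
df = pd.DataFrame({
    "state": ["California", "New York", "Florida", "Texas"],
    "1st": ["$11,593,820", "$10,861,680", "$7,942,848", "$7,536,817"],
    "2nd": ["$109,264,246", "$45,336,041", "$69,369,589", "$61,830,712"],
    "3rd": ["$8,496,273", "$6,317,300", "$4,697,244", "$5,736,941"],
})

# Hypothetical helper name: every column except "state".
money_cols = df.columns[1:]

# Remove "$" and "," from each cell, then cast the cleaned strings to float.
df[money_cols] = df[money_cols].replace(r"[\$,]", "", regex=True).astype(float)

# The columns are numeric now, so aggregations such as sum() work.
totals = df[money_cols].sum()
print(df.dtypes)
print(totals)
```

The character class `[\$,]` matches either a dollar sign or a comma, so both are dropped in one pass before the cast. If some cells might contain other non-numeric text, `astype(float)` will raise, so it is worth checking the data first.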

answered Oct 22 '22 09:10 by dagrha