
How to extract dollar amount from pandas DataFrame column

I would like to extract dollar amounts from several hundred rows in a column and save each amount in a new column. The amount varies from row to row, for example $100.01, $1,000.05, 10,000, 100,000, etc.

One of the lines looks like this:

Approving the settlement claim of Mr. X Y by payment in the amount of $120,000.65

I tried to do something like this, but it's not extracting the dollar amount:

df['amount'] = df['description'].str.extract('/(\$[0-9]+(\.[0-9]{2})?)/', expand=True)

Please help.

Qashin asked Aug 18 '18 04:08




1 Answer

If I understand correctly, you need:

import pandas as pd

df = pd.DataFrame({'description': ['ss $100.01', 'dd $1,000.05',
                                   'f 10,000', 'g 100,000',
                                   'yr 4,120,000.65']})

# Capture the first run of digits, commas, and periods in each description.
df['amount'] = df['description'].str.extract(r'([0-9,.]+)')
print(df)
       description        amount
0       ss $100.01        100.01
1     dd $1,000.05      1,000.05
2         f 10,000        10,000
3        g 100,000       100,000
4  yr 4,120,000.65  4,120,000.65
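
The extracted values above are still strings. If the amounts need to be summed or compared, they can be converted to numbers. The following is a minimal sketch rather than part of the original answer; the column name `amount_num` is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'description': ['ss $100.01', 'dd $1,000.05',
                                   'f 10,000', 'g 100,000',
                                   'yr 4,120,000.65']})

# Extract the first run of digits, commas, and periods as a Series of strings.
amount_text = df['description'].str.extract(r'([0-9,.]+)', expand=False)

# Remove thousands separators, then convert to float.
# errors='coerce' turns anything that still cannot be parsed into NaN.
# 'amount_num' is a hypothetical column name used for illustration.
df['amount_num'] = pd.to_numeric(
    amount_text.str.replace(',', '', regex=False), errors='coerce'
)
print(df)
```

Rows where no amount is found stay as NaN, so they can be filtered out later with `df['amount_num'].notna()`.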

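EDIT: To match only amounts preceded by a dollar sign, add \$ to the pattern. Put it inside the capture group to keep the sign, or outside to drop it: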

# amount1 keeps the leading "$"; amount2 captures only the number after it.
df['amount1'] = df['description'].str.extract(r'(\$[0-9,.]+)')
df['amount2'] = df['description'].str.extract(r'\$([0-9,.]+)')
print(df)

       description    amount1   amount2
0       ss $100.01    $100.01    100.01
1     dd $1,000.05  $1,000.05  1,000.05
2         f 10,000        NaN       NaN
3        g 100,000        NaN       NaN
4  yr 4,120,000.65        NaN       NaN
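
A note on the pattern in the question: Python regular expressions have no `/.../` delimiters, so the slashes are matched as literal characters, and `[0-9]+` stops at the first comma. As a hedged alternative to the character class `[0-9,.]+`, which would also capture a trailing period or comma at the end of a sentence, a stricter pattern can be sketched as follows. The pattern and the sample rows are illustrative assumptions, not taken from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'description': [
    'Approving the settlement claim of Mr. X Y by payment in the amount of $120,000.65',
    'ss $100.01',
    'paid $1,500.',   # sentence-ending period should not be captured
    'f 10,000',       # no dollar sign, so no match is expected
]})

# Assumed format: "$", digits, optional ",ddd" groups, optional two-digit cents.
# Inner groups are non-capturing so extract() sees exactly one capture group.
pattern = r'(\$\d+(?:,\d{3})*(?:\.\d{2})?)'
df['amount'] = df['description'].str.extract(pattern, expand=False)
print(df)
```

Using non-capturing groups `(?:...)` for the comma and cents parts keeps the result to a single column, which is what an assignment to `df['amount']` needs.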
jezrael answered Oct 11 '22 13:10