
I want to multiply two columns in a pandas DataFrame and add the result into a new column

I'm trying to multiply two existing columns in a pandas DataFrame (orders_df), Prices (stock close price) and Amount (stock quantities), and add the result to a new column called 'Value'. For some reason, when I run this code, all the rows in the 'Value' column are positive numbers, although some of them should be negative. The Action column in the DataFrame has seven rows with the string 'Sell' and seven with the string 'Buy'.

    for i in orders_df.Action:
        if i == 'Sell':
            orders_df['Value'] = orders_df.Prices*orders_df.Amount
        elif i == 'Buy':
            orders_df['Value'] = -orders_df.Prices*orders_df.Amount

Please let me know what I'm doing wrong!

OAK asked Dec 27 '12 18:12



2 Answers

I think an elegant solution is to use the where method (also see the API docs):

In [37]: values = df.Prices * df.Amount

In [38]: df['Values'] = values.where(df.Action == 'Sell', other=-values)

In [39]: df
Out[39]:
   Prices  Amount Action  Values
0       3      57   Sell     171
1      89      42   Sell    3738
2      45      70    Buy   -3150
3       6      43   Sell     258
4      60      47   Sell    2820
5      19      16    Buy    -304
6      56      89   Sell    4984
7       3      28    Buy     -84
8      56      69   Sell    3864
9      90      49    Buy   -4410

Furthermore, this should be the fastest solution, since it operates on whole columns at once rather than row by row.
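For anyone who wants to try this outside an interactive session, here is a minimal self-contained sketch of the same idea. The sample rows are copied from the first five lines of the output above; the rest of the setup is only an assumption about how the DataFrame might be built.

```python
import pandas as pd

# Sample data: the first five rows of the output shown in this answer.
df = pd.DataFrame({
    "Prices": [3, 89, 45, 6, 60],
    "Amount": [57, 42, 70, 43, 47],
    "Action": ["Sell", "Sell", "Buy", "Sell", "Sell"],
})

# Element-wise product of the two columns.
values = df["Prices"] * df["Amount"]

# Keep the product where Action is 'Sell'; use the negated product elsewhere.
df["Values"] = values.where(df["Action"] == "Sell", other=-values)

print(df)
```

Note that where keeps the original value wherever the condition is True and substitutes other wherever it is False, so every row whose Action is not 'Sell' ends up negated.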

bmu answered Oct 03 '22 12:10


You can use the DataFrame apply method:

orders_df['Value'] = orders_df.apply(lambda row: (row['Prices']*row['Amount']
                                                  if row['Action']=='Sell'
                                                  else -row['Prices']*row['Amount']),
                                     axis=1)

It is usually faster to use these methods than to iterate over the rows with a for loop.
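As for why the loop in the question gives only positive numbers: each pass through the loop assigns the entire 'Value' column, not a single row, so only the last Action in the frame decides the sign of every row. The sketch below reproduces that behaviour and contrasts it with a row-wise apply. The four sample rows and the helper name signed_value are made up for illustration.

```python
import pandas as pd

# Illustrative data; the last row is a 'Sell', as it apparently was for the asker.
orders_df = pd.DataFrame({
    "Prices": [3, 45, 19, 56],
    "Amount": [57, 70, 16, 89],
    "Action": ["Sell", "Buy", "Buy", "Sell"],
})

# The loop from the question: every iteration overwrites the WHOLE column,
# so the final result depends only on the last value in Action.
for action in orders_df["Action"]:
    if action == "Sell":
        orders_df["Value"] = orders_df["Prices"] * orders_df["Amount"]
    elif action == "Buy":
        orders_df["Value"] = -orders_df["Prices"] * orders_df["Amount"]

loop_result = orders_df["Value"].tolist()


def signed_value(row):
    """Return the product for a 'Sell' row and the negated product otherwise."""
    product = row["Prices"] * row["Amount"]
    return product if row["Action"] == "Sell" else -product


# apply with axis=1 calls the function once per row, so each row gets its own sign.
orders_df["Value"] = orders_df.apply(signed_value, axis=1)
apply_result = orders_df["Value"].tolist()

print(loop_result)
print(apply_result)
```

With this sample data the loop leaves every value positive because the last row is a 'Sell', while the apply version negates only the 'Buy' rows.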

Andy Hayden answered Oct 03 '22 14:10