Subtract two columns in dataframe

My df looks as follows:

    Index    Country    Val1  Val2 ... Val10
    1        Australia  1     3    ... 5
    2        Bambua     12    33   ... 56
    3        Tambua     14    34   ... 58

I'd like to subtract Val1 from Val10 for each country, so the output looks like:

    Country    Val10-Val1
    Australia  4
    Bambua     44
    Tambua     44

So far I've got:

    def myDelta(row):
        data = row[['Val10', 'Val1']]
        return pd.Series({'Delta': np.subtract(data)})

    def runDeltas():
        myDF = getDF() \
            .apply(myDelta, axis=1) \
            .sort_values(by=['Delta'], ascending=False)
        return myDF

runDeltas results in this error:

ValueError: ('invalid number of arguments', u'occurred at index 9') 

What's the proper way to fix this?

Peter asked Jan 19 '18 23:01



2 Answers

Given the following dataframe:

    import pandas as pd

    df = pd.DataFrame([["Australia", 1, 3, 5],
                       ["Bambua", 12, 33, 56],
                       ["Tambua", 14, 34, 58]],
                      columns=["Country", "Val1", "Val2", "Val10"])

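It comes down to a simple vectorized, element-wise subtraction of one column from the other (swap the operands to get Val10 - Val1, as asked in the question):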

    >>> df["Val1"] - df["Val10"]
    0    -4
    1   -44
    2   -44
    dtype: int64
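As for the error in the question: `np.subtract` is a binary function, so it needs the two operands passed separately rather than a single two-element Series. The sketch below is one possible repair of the row-wise approach, shown next to the vectorized form. The name `my_delta` and the sample frame are illustrative stand-ins for the asker's `myDelta` and `getDF()`, which are not shown in full.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the frame used in this answer.
df = pd.DataFrame(
    [["Australia", 1, 3, 5], ["Bambua", 12, 33, 56], ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)


def my_delta(row):
    # Pass both operands explicitly; np.subtract(x1, x2) computes x1 - x2.
    return pd.Series({"Delta": np.subtract(row["Val10"], row["Val1"])})


# Row-wise version, close to the code in the question.
row_wise = df.apply(my_delta, axis=1).sort_values(by="Delta", ascending=False)

# Vectorized version: same values without calling a function per row.
vectorized = (df["Val10"] - df["Val1"]).rename("Delta")
```

The row-wise version works, but the vectorized form is usually preferable because it avoids a Python-level call for every row.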
Alberto Chiusole answered Sep 18 '22 12:09


Using this as the df:

    df = pd.DataFrame([["Australia", 1, 3, 5],
                       ["Bambua", 12, 33, 56],
                       ["Tambua", 14, 34, 58]],
                      columns=["Country", "Val1", "Val2", "Val10"])

You can also do the subtraction and put it into a new column as follows.

    >>> df['Val_Diff'] = df['Val10'] - df['Val1']
    >>> df
         Country  Val1  Val2  Val10  Val_Diff
    0  Australia     1     3      5         4
    1     Bambua    12    33     56        44
    2     Tambua    14    34     58        44
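To get closer to the layout requested in the question (one row per country, sorted by the difference), the new column can be combined with a column selection and `sort_values`. This is a minimal sketch under the assumption that the sample frame above stands in for the asker's real data; `result` is an illustrative name.

```python
import pandas as pd

# Sample data, same as in the answer above.
df = pd.DataFrame(
    [["Australia", 1, 3, 5], ["Bambua", 12, 33, 56], ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

result = (
    df.assign(Val_Diff=df["Val10"] - df["Val1"])    # add the difference column
      .loc[:, ["Country", "Val_Diff"]]              # keep only the columns of interest
      .sort_values(by="Val_Diff", ascending=False)  # largest difference first
      .reset_index(drop=True)
)
```

Unlike the in-place assignment, `assign` returns a new frame, so the original `df` is left unchanged.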
Henry Owens answered Sep 22 '22 12:09