Subtracting Two Columns with a Groupby in Pandas

I have a dataframe and would like to subtract two columns of the previous row, provided that the previous row has the same Names value. If it does not, I would like the result to be NaN, filled with -. My groupby expression raises the error TypeError: 'Series' objects are mutable, thus they cannot be hashed, which is very ambiguous. What am I missing?

import pandas as pd
df = pd.DataFrame(data=[['Person A', 5, 8], ['Person A', 13, 11], ['Person B', 11, 32], ['Person B', 15, 20]], columns=['Names', 'Value', 'Value1'])
df['diff'] = df.groupby('Names').apply(df['Value'].shift(1) - df['Value1'].shift(1)).fillna('-')
print df

Desired Output:

      Names  Value  Value1  diff
0  Person A      5       8     -
1  Person A     13      11    -3
2  Person B     11      32     -
3  Person B     15      20   -21

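apply expects a function, but the code in the question passes it an already-computed Series, which is what triggers the TypeError. Wrap the expression in lambda x and change df['Value'] to x['Value'] (likewise for Value1) so that it is evaluated per group, then finish with reset_index(drop=True) to drop the group level from the index: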

# Parentheses are required for a method chain that spans several lines
df['diff'] = (df.groupby('Names')
                .apply(lambda x: x['Value'].shift(1) - x['Value1'].shift(1))
                .fillna('-')
                .reset_index(drop=True))
print (df)
      Names  Value  Value1 diff
0  Person A      5       8    -
1  Person A     13      11   -3
2  Person B     11      32    -
3  Person B     15      20  -21

Another solution with DataFrameGroupBy.shift:

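# Select several columns with a list; the tuple form ['Value','Value1'] was removed in pandas 2.0
df1 = df.groupby('Names')[['Value','Value1']].shift()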
print (df1)
   Value  Value1
0    NaN     NaN
1    5.0     8.0
2    NaN     NaN
3   11.0    32.0
df['diff'] = (df1.Value - df1.Value1).fillna('-')

print (df)
      Names  Value  Value1 diff
0  Person A      5       8    -
1  Person A     13      11   -3
2  Person B     11      32    -
3  Person B     15      20  -21
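
The shift-based approach can be sketched end to end as follows. The interleaved row order is an assumption added for illustration, to show that the grouped shift keeps the original index even when the rows of one person are not adjacent. The column name diff_display is hypothetical; it keeps the numeric diff column intact and puts the '-' placeholder in a separate display column.

```python
import pandas as pd

# Same values as the question, but with the two people interleaved
df = pd.DataFrame(
    data=[['Person A', 5, 8], ['Person B', 11, 32],
          ['Person A', 13, 11], ['Person B', 15, 20]],
    columns=['Names', 'Value', 'Value1'])

# Previous row's values within each Names group; the index matches df
prev = df.groupby('Names')[['Value', 'Value1']].shift()

# Numeric result: NaN where a group has no previous row
df['diff'] = prev['Value'] - prev['Value1']

# Display-only column with '-' in place of NaN
df['diff_display'] = df['diff'].astype(object).where(df['diff'].notna(), '-')
print(df)
```

Keeping diff numeric means it can still be summed or compared later; mixing strings into the same column turns it into an object column.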

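You can also do it this way, with diff(axis=1) on the shifted columns. Select the columns in the order Value, Value1 so that the Value1 column of the diff holds Value1 - Value, and note that this version fills the missing values with 0 rather than '-':

In [76]: df['diff'] = (-df.groupby('Names')[['Value','Value1']].shift(1).diff(axis=1)['Value1']).fillna(0)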

In [77]: df
Out[77]:
      Names  Value  Value1  diff
0  Person A      5       8   0.0
1  Person A     13      11  -3.0
2  Person B     11      32   0.0
3  Person B     15      20 -21.0
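
Since the difference only involves values from the same row, it can also be computed once and then shifted within each group. This is a variation that neither answer uses, shown as a minimal sketch on the question's data; row_diff is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 5, 8], ['Person A', 13, 11],
          ['Person B', 11, 32], ['Person B', 15, 20]],
    columns=['Names', 'Value', 'Value1'])

# Row-wise difference first, no grouping involved yet
row_diff = df['Value'] - df['Value1']

# Shift that difference down by one row inside each Names group;
# the first row of every group becomes NaN
df['diff'] = row_diff.groupby(df['Names']).shift()
print(df)
```

This avoids the intermediate shifted DataFrame and does not depend on column order.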

Tags: python, pandas, python-2.7

user2242044

2 Answers

jezrael

MaxU - stop WAR against UA
