I have the following dataframe.
df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0], ], columns=['item', 'value'])
df
item | value
a | 4
b | 1
c | 2
d | 0
I want to calculate the pairwise absolute difference between every possible pair of items, giving the following output.
item | a | b | c | d
a | 0.0 | 3.0 | 2.0 | 4.0
b | 3.0 | 0.0 | 1.0 | 1.0
c | 2.0 | 1.0 | 0.0 | 2.0
d | 4.0 | 1.0 | 2.0 | 0.0
After a lot of searching, I could only find answers for direct element-by-element differences, which produce a single-column output.
So far, I've tried
pd.pivot_table(df, values='value', index='item', columns='item', aggfunc=np.diff)
but this doesn't work.
This question has been answered here. The only difference is that you would need to add abs:
abs(df['value'].values - df['value'].values[:, None])
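For completeness, here is a minimal sketch of how that one-liner can be turned into the labelled table from the question. It assumes the same df as in the question; the names values, diff and result are just illustrative. Indexing with [:, None] turns the values into a column, so NumPy broadcasting subtracts every value from every other value and yields a 4x4 array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0]], columns=['item', 'value'])

# 1-D array of the values, shape (4,)
values = df['value'].to_numpy()

# values[:, None] has shape (4, 1); subtracting the shape-(4,) array
# broadcasts to a (4, 4) matrix of all pairwise differences
diff = np.abs(values[:, None] - values)

# Label rows and columns with the item names
result = pd.DataFrame(diff, index=df['item'], columns=df['item'])
print(result)
```

The values come out as integers here because the input column is integer; appending .astype(float) to result should give the 0.0 / 3.0 style shown in the question.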