
Pandas column pairwise difference for each possible pair [duplicate]

I have the following dataframe.

import numpy as np
import pandas as pd
df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0]], columns=['item', 'value'])
df
item | value    
a    | 4
b    | 1
c    | 2
d    | 0 

I want to calculate the pairwise absolute difference between every possible pair of items, giving the following output.

item| a     | b     | c     | d
a   | 0.0   | 3.0   | 2.0   | 4.0
b   | 3.0   | 0.0   | 1.0   | 1.0
c   | 2.0   | 1.0   | 0.0   | 2.0
d   | 4.0   | 1.0   | 2.0   | 0.0

After a lot of searching, I could only find answers for a direct element-by-element difference, which produces a single-column output.

So far, I've tried

pd.pivot_table(df, values='value', index='item', columns='item', aggfunc=np.diff)

but this doesn't work.

Thirupathi Thangavel asked Feb 05 '19 07:02






1 Answer

This question has been answered here. The only difference is that you need to add abs around the broadcast subtraction, which compares every value with every other value:

abs(df['value'].values - df['value'].values[:, None])
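
The expression above returns a bare NumPy array without the item labels. As a minimal sketch of how it could be turned into the labelled table from the question, one option is to wrap the array in a DataFrame; the names `values`, `diff` and `result`, and the cast to float to match the question's output, are my own assumptions and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["a", 4], ["b", 1], ["c", 2], ["d", 0]],
    columns=["item", "value"],
)

values = df["value"].to_numpy()

# values has shape (n,) and values[:, None] has shape (n, 1), so the
# subtraction broadcasts to an (n, n) matrix of all pairwise differences.
diff = np.abs(values - values[:, None]).astype(float)

# Label rows and columns with the item names (illustrative choice).
result = pd.DataFrame(diff, index=df["item"], columns=df["item"])
print(result)
```

Because the absolute difference is symmetric, the order of the two operands does not matter here; without abs, swapping them would flip the sign of every entry.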
Harm answered Sep 29 '22 20:09
