I have two columns, A and B. I want to subtract each value in column B from every value in column A, creating one new column per B value, without using a for loop.
Below is my DataFrame:
   A  B
0  5  3
1  3  2
2  8  1
Desired output:
   A  B  C  D  E
0  5  3  2  3  4
1  3  2  0  1  2
2  8  1  5  6  7
C = A - B[0]
D = A - B[1]
E = A - B[2]
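You can do this with NumPy's array broadcasting:
import pandas as pd

df = pd.DataFrame({'A': [5, 3, 8],
                   'B': [3, 2, 1]})
# A as a (3, 1) column minus B as a length-3 row broadcasts to a (3, 3) array
df2 = pd.DataFrame(df['A'].values[:, None] - df['B'].values, columns=['C', 'D', 'E'])
df = df.join(df2)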
Result:
   A  B  C  D  E
0  5  3  2  3  4
1  3  2  0  1  2
2  8  1  5  6  7
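If the number of rows is not known in advance, the column names can be generated instead of hard-coded. The following is only a sketch of that idea: the helper name `add_pairwise_differences` and the `diff_` prefix are made up for illustration and are not part of pandas. It also passes `index=df.index` so the join still lines up if the DataFrame does not use the default 0..n-1 index.

```python
import numpy as np
import pandas as pd


def add_pairwise_differences(df, left="A", right="B", prefix="diff_"):
    """Return df joined with one new column per row of `right`.

    Column prefix + str(j) holds df[left] - df[right].iloc[j].
    (Hypothetical helper, for illustration only.)
    """
    a = df[left].to_numpy()
    b = df[right].to_numpy()
    # (n, 1) minus (n,) broadcasts to an (n, n) array of differences
    diffs = a[:, None] - b
    names = [f"{prefix}{j}" for j in range(len(b))]
    # Reuse df's index so join aligns rows even with a non-default index
    return df.join(pd.DataFrame(diffs, columns=names, index=df.index))


df = pd.DataFrame({"A": [5, 3, 8], "B": [3, 2, 1]})
out = add_pairwise_differences(df)
print(out)
```

Note that the result has n new columns for n rows, so memory grows quadratically with the length of the DataFrame.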
Explanation:
>>> df['A'].values[:, None]
array([[5],
       [3],
       [8]])
>>> df['B'].values
array([3, 2, 1])
When subtracting them, NumPy conceptually "stretches" (broadcasts) df['A'].values[:, None], which has shape (3, 1), to:
array([[5, 5, 5],
       [3, 3, 3],
       [8, 8, 8]])
and df['B'].values, which has shape (3,), to:
array([[3, 2, 1],
       [3, 2, 1],
       [3, 2, 1]])
and the result of the subtraction is:
array([[2, 3, 4],
       [0, 1, 2],
       [5, 6, 7]])
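To see the stretched arrays for yourself, `np.broadcast_arrays` returns the broadcast views that NumPy would use for the operation. This is a small sketch with the values from the example typed in directly; the variable names are chosen here for illustration.

```python
import numpy as np

a = np.array([5, 3, 8])[:, None]  # column A, shape (3, 1)
b = np.array([3, 2, 1])           # column B, shape (3,)

# Views of a and b expanded to their common shape (3, 3); no data is copied
a_stretched, b_stretched = np.broadcast_arrays(a, b)
print(a_stretched)
print(b_stretched)

# Subtracting the original arrays gives the same result as
# subtracting the stretched views element by element
result = a - b
print(result)
```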