
Creating new column in Pandas with a condition based on existing row values and returning another row's values

Tags:

python

pandas

I would like some help with the following problem. I currently have a pandas DataFrame with three columns: test1, test2, and test3.

What I hope to achieve is the result shown in result_column, where the logic is:

1) If the values in test1 AND test2 are both > 0, then return the value of test3

2) Else if the values in test1 AND test2 are both < 0, then return the NEGATIVE value of test3

3) Otherwise return 0

   test1  test2  test3  result_column
0    0.5    0.1   1.25           1.25
1    0.2   -0.2   0.22              0
2   -0.3   -0.2   1.12          -1.12
3    0.4   -0.3   0.34              0
4    0.5      0   0.45              0

This is my first time posting a question on python and pandas. Apologies in advance if the formatting here is not optimum. Appreciate any help I can get!

Singapore 123 asked Mar 07 '23 16:03



1 Answer

I think you need numpy.select, with the conditions chained by & (AND). Alternatively, select all tested columns with a subset [[]], compare them, and test each row with DataFrame.all:

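import numpy as np

# m1: test1 and test2 are both positive
m1 = (df.test1 > 0) & (df.test2 > 0)
# alternative:
# m1 = (df[['test1', 'test2']] > 0).all(axis=1)

# m2: test1 and test2 are both negative
m2 = (df.test1 < 0) & (df.test2 < 0)
# alternative:
# m2 = (df[['test1', 'test2']] < 0).all(axis=1)

# test3 where m1 holds, -test3 where m2 holds, 0 everywhere else
df['result_column'] = np.select([m1, m2], [df.test3, -df.test3], default=0)
print(df)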
   test1  test2  test3  result_column
0    0.5    0.1   1.25           1.25
1    0.2   -0.2   0.22           0.00
2   -0.3   -0.2   1.12          -1.12
3    0.4   -0.3   0.34           0.00
4    0.5    0.0   0.45           0.00
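
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and uses the DataFrame.all variant of the masks. The names cols, both_pos, both_neg, sign, and result_via_sign are only illustrative and are not part of the original answer. The last two lines show one possible equivalent, assuming you prefer to multiply test3 by a per-row sign of +1, -1, or 0 instead of calling numpy.select.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "test1": [0.5, 0.2, -0.3, 0.4, 0.5],
    "test2": [0.1, -0.2, -0.2, -0.3, 0.0],
    "test3": [1.25, 0.22, 1.12, 0.34, 0.45],
})

# Row-wise masks: True only when every tested column passes the comparison.
cols = ["test1", "test2"]  # illustrative name
both_pos = (df[cols] > 0).all(axis=1)
both_neg = (df[cols] < 0).all(axis=1)

# The first matching condition wins; rows matching neither get the default.
df["result_column"] = np.select(
    [both_pos, both_neg],
    [df["test3"], -df["test3"]],
    default=0,
)

# Alternative sketch: +1 where both positive, -1 where both negative, else 0.
sign = both_pos.astype(int) - both_neg.astype(int)
df["result_via_sign"] = df["test3"] * sign

print(df)
```

Both columns should agree on this data. The numpy.select form is probably easier to extend if more conditions are added later, since each new rule is just one more mask and one more choice.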
jezrael answered Mar 09 '23 06:03
