Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

compare multiple columns of pandas dataframe with one column

I have a dataframe: df-

    A   B   C   D   E
0   V   10  5   18  20
1   W   9   18  11  13
2   X   8   7   12  5
3   Y   7   9   7   8
4   Z   6   5   3   90

I want to add a column 'Result' which should return 1 if the value in column 'E' is greater than the values in B, C & D columns else return 0.

Output should be:

    A   B   C   D   E   Result
0   V   10  5   18  20  1
1   W   9   18  11  13  0
2   X   8   7   12  5   0
3   Y   7   9   7   8   0
4   Z   6   5   3   90  1

For few columns, i would use logic like : if(and(E>B,E>C,E>D),1,0), But I have to compare around 20 columns (from B to U) with column name 'V'. Additionally, the dataframe has around 100 thousand rows.

I am using

df['Result']=np.where((df.ix[:,1:20])<df['V']).all(1),1,0)

And it gives a Memory error.

like image 871
piyush bansal Avatar asked Oct 19 '25 10:10

piyush bansal


1 Answers

One possible solution is compare in numpy and last convert boolean mask to ints:

df['Result'] = (df.iloc[:, 1:4].values < df[['E']].values).all(axis=1).astype(int)
print (df)
   A   B   C   D   E  Result
0  V  10   5  18  20       1
1  W   9  18  11  13       0
2  X   8   7  12   5       0
3  Y   7   9   7   8       0
4  Z   6   5   3  90       1
like image 178
jezrael Avatar answered Oct 21 '25 01:10

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!