
pandas multiple conditions based on multiple columns using np.where

I am trying to color points of a pandas dataframe depending on TWO conditions. Example:

IF value of col1 > a AND value of col2 - value of col3 < b THEN value of col4 = string
ELSE value of col4 = other string.

I have tried many different approaches, and everything I found online depends on only one condition.

My example code always raises this error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here's the code. Tried several variations without success.

import numpy as np
import pandas as pd

df = pd.DataFrame()

df['A'] = range(10)
df['B'] = range(11,21,1)
df['C'] = range(20,10,-1)

borderE = 3.
ex = 0.

#print df

df['color'] = np.where(all([df.A < borderE, df.B - df.C < ex]), 'r', 'b')

By the way, I understand what the error says, but not how to handle it.

Robert asked Apr 13 '16 15:04



2 Answers

Use Boolean indexing: wrap each condition in parentheses and combine them with the element-wise & operator instead of the built-in all():

df['color'] = np.where(((df.A < borderE) & ((df.B - df.C) < ex)), 'r', 'b')

>>> df
   A   B   C color
0  0  11  20     r
1  1  12  19     r
2  2  13  18     r
3  3  14  17     b
4  4  15  16     b
5  5  16  15     b
6  6  17  14     b
7  7  18  13     b
8  8  19  12     b
9  9  20  11     b
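
As a minimal sketch of why the original attempt fails: Python's built-in all() asks each Series for a single truth value, which pandas refuses to give, while & compares row by row. The variable names raised, mask, colors and reduced are only illustrative, and np.logical_and.reduce is shown as one possible alternative when there are many conditions, not as the approach used in this answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": range(10), "B": range(11, 21), "C": range(20, 10, -1)})
borderE = 3.0
ex = 0.0

# Built-in all() calls bool() on each Series, which raises ValueError.
try:
    all([df.A < borderE, df.B - df.C < ex])
    raised = False
except ValueError:
    raised = True

# Element-wise AND: one boolean per row.
mask = (df.A < borderE) & ((df.B - df.C) < ex)
colors = np.where(mask, "r", "b")

# Alternative for a longer list of conditions (illustrative only).
reduced = np.logical_and.reduce([df.A < borderE, (df.B - df.C) < ex])
```

The parentheses around each comparison matter because & binds more tightly than < and > in Python.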
Alexander answered Oct 01 '22 19:10


Wrap the condition in a function and apply it row by row:

def color(row):
    borderE = 3.
    ex = 0.
    if (row.A > borderE) and (row.B - row.C < ex):
        return "somestring"
    else:
        return "otherstring"

df.loc[:, 'color'] = df.apply(color, axis=1)

Yields:

   A   B   C        color
0  0  11  20  otherstring
1  1  12  19  otherstring
2  2  13  18  otherstring
3  3  14  17  otherstring
4  4  15  16   somestring
5  5  16  15  otherstring
6  6  17  14  otherstring
7  7  18  13  otherstring
8  8  19  12  otherstring
9  9  20  11  otherstring
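
For comparison, here is a rough sketch showing that the same rule can also be written without a row-wise function. It assumes the same df, borderE and ex as in the question. The names by_apply and by_where are illustrative. Row-wise apply is generally slower than a vectorized expression on large frames, though for ten rows the difference is negligible.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": range(10), "B": range(11, 21), "C": range(20, 10, -1)})
borderE = 3.0
ex = 0.0

def color(row):
    # Plain Python `and` is fine here: row.A and row.B are scalars.
    if row.A > borderE and row.B - row.C < ex:
        return "somestring"
    return "otherstring"

# Row-wise version, as in this answer.
by_apply = df.apply(color, axis=1)

# Vectorized version of the same condition.
by_where = np.where(
    (df.A > borderE) & ((df.B - df.C) < ex), "somestring", "otherstring"
)
```

Both versions label only the row with A = 4 as "somestring", since it is the only row where A > 3 and B - C < 0.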
Sam answered Oct 01 '22 20:10