numpy.where() with 3 or more conditions

I have a dataframe with multiple columns.

      AC     BC     CC      DC     MyColumn

I would like to set a new column "MyColumn" as follows: if BC, CC, and DC are all less than AC, take the max of the three for that row. If only CC and DC are less than AC, take the max of CC and DC for that row, and so on for the other combinations. If none of them is less than AC, MyColumn should just take the value from AC.

How would I do this with numpy.where()?

asked Mar 31 '14 17:03

mit13plee

1 Answer

You can use the DataFrame lt method along with where (the session below assumes numpy is imported as np and pandas as pd):

In [11]: df = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))

In [12]: df
Out[12]:
          A         B         C         D
0  1.587878 -2.189620  0.631958 -0.432253
1 -1.636721  0.568846 -0.033618 -0.648406
2  1.567512  1.089788  0.489559  1.673372
3  0.589222 -1.176961 -1.186171  0.249795
4  0.366227  1.830107 -1.074298 -1.882093

Note: you can take the row-wise max of a subset of columns (the 1 passed to max is axis=1):

In [13]: df[['B', 'C', 'D']].max(1)
Out[13]:
0    0.631958
1    0.568846
2    1.673372
3    0.249795
4    1.830107
dtype: float64

Look at each column's values to see if they are less than A:

In [14]: lt_A = df.lt(df['A'], axis=0)

In [15]: lt_A
Out[15]:
       A      B      C      D
0  False   True   True   True
1  False  False  False  False
2  False   True   True  False
3  False   True   True   True
4  False  False   True   True

In [15]: lt_A[['B', 'C', 'D']].all(1)
Out[15]:
0     True
1    False
2    False
3     True
4    False
dtype: bool
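Since the frame above is random, here is a minimal sketch of the same two steps on a small hand-made frame, so the result can be checked by eye. The values are made up for illustration; only the column names follow the example above.

```python
import pandas as pd

# Hand-made data (an assumption for illustration, not the random frame above).
df = pd.DataFrame(
    {"A": [5, 1, 3], "B": [4, 2, 1], "C": [2, 3, 4], "D": [1, 4, 2]}
)

# axis=0 aligns the Series df["A"] on the row index, so every column is
# compared element-wise against the A value in the same row.
lt_A = df.lt(df["A"], axis=0)

# True only for rows where B, C and D are all below A.
all_below = lt_A[["B", "C", "D"]].all(axis=1)
print(all_below.tolist())
```

Only the first row has B, C and D all below A, so only that row is flagged.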

Now you can build up your desired result using all, with 2 as a temporary placeholder for the rows that fail the condition:

In [16]: df[['B', 'C', 'D']].max(1).where(lt_A[['B', 'C', 'D']].all(1), 2)
Out[16]:
0    0.631958
1    2.000000
2    2.000000
3    0.249795
4    2.000000
dtype: float64
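One thing that is easy to trip over: Series.where keeps the original values where the condition is True and substitutes the second argument elsewhere, whereas the three-argument numpy.where takes the "true" values and the "false" values explicitly. A small sketch with made-up values:

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30])
keep = pd.Series([True, False, True])

# pandas: keep s where `keep` is True, otherwise use -1.
kept = s.where(keep, -1)

# numpy: the same selection, with both branches spelled out.
same = np.where(keep.to_numpy(), s.to_numpy(), -1)

print(kept.tolist())
print(same.tolist())
```

Both calls select the same values; the pandas version also preserves the index.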

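Rather than the placeholder 2, you first build the fallback Series for the next case, where C and D are both less than A (in this example rows 0 and 3 happen to give the same values as before):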

In [17]: df[['C', 'D']].max(1).where(lt_A[['C', 'D']].all(1), 2)
Out[17]:
0    0.631958
1    2.000000
2    2.000000
3    0.249795
4   -1.074298
dtype: float64

and then column A:

In [18]: df[['B', 'C', 'D']].max(1).where(lt_A[['B', 'C', 'D']].all(1), df[['C', 'D']].max(1).where(lt_A[['C', 'D']].all(1), df['A']))
Out[18]:
0    0.631958
1   -1.636721
2    1.567512
3    0.249795
4   -1.074298
dtype: float64
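Since the question asks about numpy.where specifically, here is a sketch of the same two-level fallback written with nested numpy.where calls, and with numpy.select, which takes a list of conditions and picks the first one that matches. The data is made up for illustration, and this mirrors only the two cases handled above (all of B, C, D below A; then C and D below A), not every possible combination.

```python
import numpy as np
import pandas as pd

# Hand-made data (an assumption for illustration).
df = pd.DataFrame(
    {
        "A": [5.0, 1.0, 3.0, 2.0],
        "B": [4.0, 2.0, 9.0, 1.0],
        "C": [2.0, 3.0, 1.0, 1.5],
        "D": [1.0, 4.0, 2.0, 7.0],
    }
)
lt_A = df.lt(df["A"], axis=0)

# Conditions in priority order, with the value to use for each.
cond_bcd = lt_A[["B", "C", "D"]].all(axis=1).to_numpy()
cond_cd = lt_A[["C", "D"]].all(axis=1).to_numpy()
max_bcd = df[["B", "C", "D"]].max(axis=1).to_numpy()
max_cd = df[["C", "D"]].max(axis=1).to_numpy()
fallback = df["A"].to_numpy()

# Nested numpy.where: the "else" branch of the outer call is another where.
nested = np.where(cond_bcd, max_bcd, np.where(cond_cd, max_cd, fallback))

# numpy.select: the first matching condition wins, `default` covers the rest.
selected = np.select([cond_bcd, cond_cd], [max_bcd, max_cd], default=fallback)

df["MyColumn"] = selected
print(df["MyColumn"].tolist())
```

With more than two or three conditions, numpy.select tends to stay readable where nested numpy.where calls do not.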

Clearly, you should write this as a function if you're planning on reusing it!
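As a sketch of what such a function might look like, here is one that uses a different technique from the nested where above: it masks out every value that is not below the reference column and takes the row-wise max of what is left. This assumes the "and so on" in the question means every combination of columns should be handled, not just the two cases shown. The name max_below and its parameters ref and cols are hypothetical, chosen for illustration.

```python
import pandas as pd


def max_below(df, ref="A", cols=("B", "C", "D")):
    """Row-wise max of the `cols` values that are below `ref`, else `ref`.

    Hypothetical helper: assumes every subset of `cols` should be considered.
    """
    candidates = df[list(cols)]
    # Keep only values strictly below the reference column; others become NaN.
    below = candidates.where(candidates.lt(df[ref], axis=0))
    # max skips NaN by default; a row with no qualifying value yields NaN,
    # which is then filled from the reference column.
    return below.max(axis=1).fillna(df[ref])


# Hand-made data (an assumption for illustration).
df = pd.DataFrame(
    {
        "A": [5.0, 1.0, 3.0, 2.0],
        "B": [4.0, 2.0, 9.0, 1.0],
        "C": [2.0, 3.0, 1.0, 1.5],
        "D": [1.0, 4.0, 2.0, 7.0],
    }
)
df["MyColumn"] = max_below(df)
print(df["MyColumn"].tolist())
```

Note that in the last row only B and C are below A, so this version returns their max, while the two-case nested where would fall through to A.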

answered Oct 19 '22 23:10

Andy Hayden

Tags: python, pandas, numpy, where
