Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Index of last occurrence of max before min

The title might not be intuitive--let me provide an example. Say I have df, created with:

a = np.array([[ 1. ,  0.9,  1. ],
              [ 0.9,  0.9,  1. ],
              [ 0.8,  1. ,  0.5],
              [ 1. ,  0.3,  0.2],
              [ 1. ,  0.2,  0.1],
              [ 0.9,  1. ,  1. ],
              [ 1. ,  0.9,  1. ],
              [ 0.6,  0.9,  0.7],
              [ 1. ,  0.9,  0.8],
              [ 1. ,  0.8,  0.9]])

idx = pd.date_range('2017', periods=a.shape[0])
df = pd.DataFrame(a, index=idx, columns=list('abc'))

I can get the index location of each respective column minimum with

df.idxmin()

Now, how could I get the location of the last occurrence of the column-wise maximum, up to the location of the minimum?

Visually, I want to find the location of the green max's below:

enter image description here

where the max's after the minimum occurrence are ignored.

I can do this with .apply, but can it be done with a mask/advanced indexing?

Desired result:

a   2017-01-07
b   2017-01-03
c   2017-01-02
dtype: datetime64[ns]
like image 754
Brad Solomon Avatar asked Dec 19 '17 17:12

Brad Solomon


1 Answers

Apply a mask and then call idxmax on the reversed dataframe.

df.mask((df == df.min()).cumsum().astype(bool))[::-1].idxmax()

a   2017-01-07
b   2017-01-03
c   2017-01-02
dtype: datetime64[ns]

Details

First, identify the location of the smallest items per column.

df.min()

a    0.6
b    0.2
c    0.1
dtype: float64

i = df == df.min()
i

                a      b      c
2017-01-01  False  False  False
2017-01-02  False  False  False
2017-01-03  False  False  False
2017-01-04  False  False  False
2017-01-05  False   True   True
2017-01-06  False  False  False
2017-01-07  False  False  False
2017-01-08   True  False  False
2017-01-09  False  False  False
2017-01-10  False  False  False

Now, mask those values and beyond!

j = df.mask(i).cumsum().astype(bool))
j

              a    b    c
2017-01-01  1.0  0.9  1.0
2017-01-02  0.9  0.9  1.0
2017-01-03  0.8  1.0  0.5
2017-01-04  1.0  0.3  0.2
2017-01-05  1.0  NaN  NaN
2017-01-06  0.9  NaN  NaN
2017-01-07  1.0  NaN  NaN
2017-01-08  NaN  NaN  NaN
2017-01-09  NaN  NaN  NaN
2017-01-10  NaN  NaN  NaN

To find the last maximum, just reverse and call idxmax.

j[::-1].idxmax()

a   2017-01-07
b   2017-01-03
c   2017-01-02
dtype: datetime64[ns]
like image 147
cs95 Avatar answered Oct 22 '22 09:10

cs95