How do I calculate a rolling idxmax?

Tags: python, pandas, dataframe, numpy, series

Consider the pd.Series s:

import pandas as pd
import numpy as np

np.random.seed([3,1415])
s = pd.Series(np.random.randint(0, 10, 10), list('abcdefghij'))
s

a    0
b    2
c    7
d    3
e    8
f    7
g    0
h    6
i    8
j    6
dtype: int64

I want to get the index label of the max value within each rolling window of 3. The rolling max itself is easy:

s.rolling(3).max()

a    NaN
b    NaN
c    7.0
d    7.0
e    8.0
f    8.0
g    8.0
h    7.0
i    8.0
j    8.0
dtype: float64

What I want is:

a    None
b    None
c       c
d       c
e       e
f       e
g       e
h       f
i       i
j       i
dtype: object

What I've tried:

s.rolling(3).apply(np.argmax)

a    NaN
b    NaN
c    2.0
d    1.0
e    2.0
f    1.0
g    0.0
h    0.0
i    2.0
j    1.0
dtype: float64

which is obviously not what I want: it gives the position of the max inside each window, not the index label.

asked Oct 18 '16 06:10

piRSquared

2 Answers

There is no simple way to do that, because the argument that is passed to the rolling-applied function is a plain numpy array, not a pandas Series, so it doesn't know about the index. Moreover, the rolling functions must return a float result, so they can't directly return the index values if they're not floats.
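As a rough illustration of the two constraints described above, the sketch below probes what a rolling-applied function receives and what comes back. It passes raw=True to request NumPy arrays explicitly, since what is passed by default depends on the pandas version. The helper name probe and the hard-coded values (copied from the question's output) are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Values copied from the question's printed Series.
s = pd.Series([0, 2, 7, 3, 8, 7, 0, 6, 8, 6], index=list("abcdefghij"))

seen_types = []

def probe(window):
    # Hypothetical helper: record the type of each window that is passed in.
    seen_types.append(type(window).__name__)
    return np.argmax(window)  # an integer position within the window

# raw=True asks pandas to pass each window as a NumPy array (no index attached).
out = s.rolling(3).apply(probe, raw=True)

print(set(seen_types))  # the windows arrive as plain arrays
print(out.dtype)        # the integer positions come back as floats
```

Because the window carries no labels and the result is coerced to float, the label lookup has to happen outside of apply.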

Here is one approach:

>>> s.index[s.rolling(3).apply(np.argmax)[2:].astype(int)+np.arange(len(s)-2)]
Index([u'c', u'c', u'e', u'e', u'e', u'f', u'i', u'i'], dtype='object')

The idea is to take the argmax values and align them with the series by adding a value indicating how far along in the series we are. (That is, for the first argmax value we add zero, because it is giving us the index into a subsequence starting at index 0 in the original series; for the second argmax value we add one, because it is giving us the index into a subsequence starting at index 1 in the original series; etc.)

This gives the correct results, but doesn't include the two "None" values at the beginning; you'd have to add those back manually if you wanted them.
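One possible way to add the leading None values back is sketched below. It follows the same argmax-plus-offset idea, but the variable names, the explicit raw=True, and the hard-coded values (copied from the question's output) are assumptions made for this example rather than part of the original answer.

```python
import numpy as np
import pandas as pd

# Values copied from the question's printed Series.
s = pd.Series([0, 2, 7, 3, 8, 7, 0, 6, 8, 6], index=list("abcdefghij"))
w = 3  # window size

# Position of the max inside each window; the first w - 1 entries are NaN.
pos_in_window = s.rolling(w).apply(np.argmax, raw=True).to_numpy()

# The window ending at position k starts at position k - (w - 1).
window_start = np.arange(len(s)) - (w - 1)

# Only positions with a full window have a valid argmax.
valid = ~np.isnan(pos_in_window)
abs_pos = pos_in_window[valid].astype(int) + window_start[valid]

# Start from all None, then fill in the labels where a full window exists.
labels = np.full(len(s), None, dtype=object)
labels[valid] = s.index.to_numpy()[abs_pos]

result = pd.Series(labels, index=s.index)
print(result)
```

On ties, np.argmax picks the first occurrence, so the earliest label within the window is reported.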

There is an open pandas issue to add rolling idxmax.

answered Oct 17 '22 18:10

BrenBarn

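I used a generator:

def idxmax(s, w):
    # Yield the index label of the max in each length-w window of s.
    for i in range(len(s) - w + 1):
        yield s.iloc[i:i + w].idxmax()

# The first w - 1 positions have no full window, so align to s.index[2:].
pd.Series(idxmax(s, 3), s.index[2:])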

c    c
d    c
e    e
f    e
g    e
h    f
i    i
j    i
dtype: object
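For longer series, the Python-level loop can be swapped for a vectorized variant. The sketch below is a different technique from the generator above: it relies on numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later), and the helper name rolling_idxmax is hypothetical. It assumes len(s) >= w and that the values contain no NaN.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def rolling_idxmax(s, w):
    """Hypothetical helper: label of the max in each trailing window of size w."""
    # One row per full window; shape is (len(s) - w + 1, w).
    windows = sliding_window_view(s.to_numpy(), w)
    # Position of the max within each window, shifted by the window's start.
    abs_pos = windows.argmax(axis=1) + np.arange(len(windows))
    # The first w - 1 entries have no full window and stay None.
    labels = np.full(len(s), None, dtype=object)
    labels[w - 1:] = s.index.to_numpy()[abs_pos]
    return pd.Series(labels, index=s.index)

# Values copied from the question's printed Series.
s = pd.Series([0, 2, 7, 3, 8, 7, 0, 6, 8, 6], index=list("abcdefghij"))
print(rolling_idxmax(s, 3))
```

Unlike the generator, this version keeps the full index and leaves the first two entries as None, which matches the output requested in the question.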

answered Oct 17 '22 19:10

piRSquared
