Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most recent max/min value

Tags:

python

pandas

I have the following dataframe:

date          value
2014-01-20    10
2014-01-21    12
2014-01-22    13
2014-01-23    9
2014-01-24    7
2014-01-25    12
2014-01-26    11

I need to be able to keep track of when the latest maximum and minimum value occurred within a specific rolling window. For example if I were to use a rolling window period of 5, then I would need an output like the following:

date          value   rolling_max_date    rolling_min_date
2014-01-20    10      2014-01-20          2014-01-20
2014-01-21    12      2014-01-21          2014-01-20
2014-01-22    13      2014-01-22          2014-01-20
2014-01-23    9       2014-01-22          2014-01-23
2014-01-24    7       2014-01-22          2014-01-24
2014-01-25    12      2014-01-22          2014-01-24
2014-01-26    11      2014-01-25          2014-01-24

All this shows is, what is the date of the latest maximum and minimum value within the rolling window. I know pandas has rolling_min and rolling_max, but im not sure how to keep track of the index/date of when the most recent max/min occured within the window.

like image 498
darkpool Avatar asked Aug 10 '15 13:08

darkpool


People also ask

Which is higher Min or Max?

The min is simply the lowest observation, while the max is the highest observation.

What is the max min value?

The largest value in a data set is often called the maximum (or max for short), and the smallest value is called the minimum (or min). The difference between the maximum and minimum value is sometimes called the range and is calculated by subtracting the smallest value from the largest value.

What is MIN () and MAX ()?

Calling min() and max() With a Single Iterable Argument In these examples, you call min() and max() with a list of integer numbers and then with an empty list. The first call to min() returns the smallest number in the input list, -5 . In contrast, the first call to max() returns the largest number in the list, or 9 .

How do you find the max and min of a column in R?

In R, we can find the minimum or maximum value of a vector or data frame. We use the min() and max() function to find minimum and maximum value respectively. The min() function returns the minimum value of a vector or data frame. The max() function returns the maximum value of a vector or data frame.


2 Answers

There is a more general rolling_apply where you can provide your own function. However, the custom functions receives the windows as arrays, not dataframes, so the index information is not available (so you cannot use idxmin/max).

But lets try to achieve this in two steps:

In [41]: df = df.set_index('date')
In [42]: pd.rolling_apply(df, window=5, func=lambda x: x.argmin(), min_periods=1)
Out[42]:
            value
date
2014-01-20      0
2014-01-21      0
2014-01-22      0
2014-01-23      3
2014-01-24      4
2014-01-25      3
2014-01-26      2

This gives you the index in the window where the minimum is found. But, this index is for that particular window and not for the entire dataframe. So let's add the start of the window, and then use this integer location to retrieve the correct index locations index:

In [45]: ilocs_window = pd.rolling_apply(df, window=5, func=lambda x: x.argmin(), min_periods=1)

In [46]: ilocs = ilocs_window['value'] + ([0, 0, 0, 0] + range(len(ilocs_window)-4))

In [47]: ilocs
Out[47]:
date
2014-01-20    0
2014-01-21    0
2014-01-22    0
2014-01-23    3
2014-01-24    4
2014-01-25    4
2014-01-26    4
Name: value, dtype: float64

In [48]: df.index.take(ilocs)
Out[48]:
Index([u'2014-01-20', u'2014-01-20', u'2014-01-20', u'2014-01-23',
       u'2014-01-24', u'2014-01-24', u'2014-01-24'],
      dtype='object', name=u'date')

In [49]: df['rolling_min_date'] = df.index.take(ilocs)

In [50]: df
Out[50]:
            value rolling_min_date
date
2014-01-20     10       2014-01-20
2014-01-21     12       2014-01-20
2014-01-22     13       2014-01-20
2014-01-23      9       2014-01-23
2014-01-24      7       2014-01-24
2014-01-25     12       2014-01-24
2014-01-26     11       2014-01-24

The same can be done for the maximum:

ilocs_window = pd.rolling_apply(df, window=5, func=lambda x: x.argmax(), min_periods=1)
ilocs = ilocs_window['value'] + ([0, 0, 0, 0] + range(len(ilocs_window)-4))
df['rolling_max_date'] = df.index.take(ilocs)
like image 191
joris Avatar answered Oct 04 '22 21:10

joris


Here is a workaround.

import pandas as pd
import numpy as np

# sample data
# ===============================================
np.random.seed(0)
df = pd.DataFrame(np.random.randint(1,30,20), index=pd.date_range('2015-01-01', periods=20, freq='D'), columns=['value'])
df

            value
2015-01-01     13
2015-01-02     16
2015-01-03     22
2015-01-04      1
2015-01-05      4
2015-01-06     28
2015-01-07      4
2015-01-08      8
2015-01-09     10
2015-01-10     20
2015-01-11     22
2015-01-12     19
2015-01-13      5
2015-01-14     24
2015-01-15      7
2015-01-16     25
2015-01-17     25
2015-01-18     13
2015-01-19     27
2015-01-20      2

# processing
# ==========================================
# your cumstom function to track on max/min value/date
def track_minmax(df):
    return pd.Series({'current_date': df.index[-1], 'rolling_max_val': df['value'].max(), 'rolling_max_date': df['value'].idxmax(), 'rolling_min_val': df['value'].min(), 'rolling_min_date': df['value'].idxmin()})

window = 5
# use list comprehension to do the for loop
pd.DataFrame([track_minmax(df.iloc[i:i+window]) for i in range(len(df)-window+1)]).set_index('current_date').reindex(df.index)

           rolling_max_date  rolling_max_val rolling_min_date  rolling_min_val
2015-01-01              NaT              NaN              NaT              NaN
2015-01-02              NaT              NaN              NaT              NaN
2015-01-03              NaT              NaN              NaT              NaN
2015-01-04              NaT              NaN              NaT              NaN
2015-01-05       2015-01-03               22       2015-01-04                1
2015-01-06       2015-01-06               28       2015-01-04                1
2015-01-07       2015-01-06               28       2015-01-04                1
2015-01-08       2015-01-06               28       2015-01-04                1
2015-01-09       2015-01-06               28       2015-01-05                4
2015-01-10       2015-01-06               28       2015-01-07                4
2015-01-11       2015-01-11               22       2015-01-07                4
2015-01-12       2015-01-11               22       2015-01-08                8
2015-01-13       2015-01-11               22       2015-01-13                5
2015-01-14       2015-01-14               24       2015-01-13                5
2015-01-15       2015-01-14               24       2015-01-13                5
2015-01-16       2015-01-16               25       2015-01-13                5
2015-01-17       2015-01-16               25       2015-01-13                5
2015-01-18       2015-01-16               25       2015-01-15                7
2015-01-19       2015-01-19               27       2015-01-15                7
2015-01-20       2015-01-19               27       2015-01-20                2
like image 31
Jianxun Li Avatar answered Oct 04 '22 21:10

Jianxun Li