I have a Series that contains scalar values indexes by days over several years. For some years there are not data.
2014-10-07 5036.883410
2013-10-11 5007.515654
2013-10-27 5020.184053
2014-09-12 5082.379630
2014-10-14 5032.669801
2014-10-30 5033.276159
2016-10-03 5046.921912
2016-10-19 5141.861889
2017-10-06 5266.138810
From this I want to get 1. the maximum for each year 2. the day of the maximum for each year For those years where there are not data, there should be a nan.
To resolve 1. the following works:
import pandas as pd
import numpy as np
data= pd.Series( index=pd.DatetimeIndex(['2014-10-07', '2013-10-11', '2013-10-27', '2014-09-12', '2014-10-14', '2014-10-30', '2016-10-03', '2016-10-19', '2017-10-06'], dtype='datetime64[ns]', name='time', freq=None), data=np.array([5036.88341035, 5007.51565355, 5020.18405295, 5082.37963023, 5032.66980146, 5033.27615931, 5046.92191246, 5141.86188915, 5266.1388102 ]))
# get maximum of each year
data.resample('A').max()
However, I tried different options to get the index of the date with the maximum, but they all failed:
data.resample('A').idxmax()
This raises the following Attribute error:
AttributeError: 'DatetimeIndexResampler' object has no attribute 'idxmax'
Then I tried the following:
data.groupby(pd.TimeGrouper('A')).idxmax()
but this gave an ValueError without specification. I then found this workaround:
data.groupby(pd.TimeGrouper('A')).agg( lambda x : x.idxmax() )
but I did not wore either for temporally grouped data:
ValueError: attempt to get argmax of an empty sequence
Apparently the reported bug has not been fixed yet and the suggested workaround for categorical data does not seem to work for temporally grouped/resampled data.
Can anyone provide a suitable workaround for this case or maybe an entirely different (and efficient) solution approach to the above problem?
Thanks in advance!
The problem is that you have no records during 2015, but a time period for 2015 is created since it is inside your years' range. You need to manually process this case:
data.resample('A').agg(
lambda x : np.nan if x.count() == 0 else x.idxmax()
)
Output:
time
2013-12-31 2013-10-27
2014-12-31 2014-09-12
2015-12-31 NaT
2016-12-31 2016-10-19
2017-12-31 2017-10-06
Freq: A-DEC, dtype: datetime64[ns]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With