id vi dates f_id
0 5532714 0.549501 2015-07-07 ff_22
1 5532715 0.540969 2015-07-08 ff_22
2 5532716 0.531477 2015-07-09 ff_22
3 5532717 0.521029 2015-07-10 ff_22
4 5532718 0.509694 2015-07-11 ff_22
In the dataframe above, I want to find the average value of vi for each year. This does not work:
df.groupby(df.dates.year)['vi'].transform(mean)
I get this error: *** AttributeError: 'Series' object has no attribute 'year'
How can I fix this?
Let's make sure that dates is datetime dtype, then use the .dt accessor as .dt.year:
df['dates'] = pd.to_datetime(df.dates)
df.groupby(df.dates.dt.year)['vi'].transform('mean')
Output:
0 0.530534
1 0.530534
2 0.530534
3 0.530534
4 0.530534
Name: vi, dtype: float64
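Note that transform returns one value per original row, which is why the same mean is repeated five times above. If one row per year is wanted instead, a minimal sketch (rebuilding the sample frame from the question by hand, which is an assumption about how the data is loaded) could look like this:

```python
import pandas as pd

# Sample data copied from the question; the construction itself is illustrative
df = pd.DataFrame({
    "id": [5532714, 5532715, 5532716, 5532717, 5532718],
    "vi": [0.549501, 0.540969, 0.531477, 0.521029, 0.509694],
    "dates": ["2015-07-07", "2015-07-08", "2015-07-09", "2015-07-10", "2015-07-11"],
    "f_id": ["ff_22"] * 5,
})
df["dates"] = pd.to_datetime(df["dates"])

# transform: result is aligned with the original rows (same length as df)
per_row = df.groupby(df["dates"].dt.year)["vi"].transform("mean")

# mean: result has one entry per year, indexed by the year
per_year = df.groupby(df["dates"].dt.year)["vi"].mean()
```

Use per_row when the yearly mean should be attached back to df as a new column, and per_year when only a summary table is needed.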
Updating and completing @piRsquared's example below for recent versions of pandas (e.g. v1.1.0), using pd.Grouper instead of TimeGrouper, which was deprecated:
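import pandas as pd
import numpy as np

# Daily DatetimeIndex covering four full years
tidx = pd.date_range('2010-01-01', '2013-12-31', name='dates')
np.random.seed([3,1415])
df = pd.DataFrame(dict(vi=np.random.rand(tidx.size)), tidx)
# Group the index by year-end frequency (pandas 2.2+ deprecates 'Y' in favor of 'YE')
df.groupby(pd.Grouper(freq='1Y')).mean()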
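When the dates are already the index, grouping on the index's year attribute may be a simpler alternative that does not depend on a frequency alias. The following is only a sketch under that assumption; the seed and the use of default_rng are illustrative choices, not part of the original example:

```python
import numpy as np
import pandas as pd

# Same date range as the example above; random values are placeholders
tidx = pd.date_range("2010-01-01", "2013-12-31", name="dates")
rng = np.random.default_rng(0)  # assumed seed, only for reproducibility
df = pd.DataFrame({"vi": rng.random(tidx.size)}, index=tidx)

# DatetimeIndex exposes .year directly (no .dt accessor needed on an index)
yearly = df.groupby(df.index.year)["vi"].mean()
```

The result is indexed by integer years (2010 through 2013) rather than by year-end timestamps, which is the main visible difference from the Grouper version.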