My pandas dataframe consists of a categorical column JOB_TITLE, a numeric column BASE_SALARY and a datetime index JOIN_DATE. I'd like to perform an aggregation over the groups of the categorical and downsampled datetimes as follows:
# Resampled at frequency of start data of every 5 years
mean_agg = (df
.groupby('JOB_TITLE')
.resample('5AS')['BASE_SALARY']
.mean())
Unfortunately, as the groupby operation is occurring before the resample, the resample operation is performed independently for each JOB_TITLE group. This results in following Series:
| JOB_TITLE | JOIN_DATE | |
|-------------------|------------|-------|
| Data Scientist | 2004-01-01 | 60000 |
| | 2009-01-01 | 75000 |
| | 2014-01-01 | 90000 |
| | | |
| Software Engineer | 2001-01-01 | 70000 |
| | 2006-01-01 | 85000 |
| | 2011-01-01 | 90000 |
| | 2016-01-01 | 85000 |
As you can see the indexes at JOIN_DATE level for Data Scientist group and Software Engineer are not aligned. This creates a problem when you apply unstack for level JOB_TITLE as follows:
mean_agg.unstack('JOB_TITLE')
This results in the following dataframe:
| JOB_TITLE | Data Scientist | Software Engineer |
|------------|----------------|-------------------|
| JOIN_DATE | | |
| 2001-01-01 | NaN | 70000 |
| 2004-01-01 | 60000 | NaN |
| 2006-01-01 | NaN | 85000 |
| 2009-01-01 | 75000 | NaN |
| 2011-01-01 | NaN | 70000 |
| 2014-01-01 | 90000 | NaN |
| 2016-01-01 | NaN | 85000 |
How can I avoid this sequential operation of groupby and resample and instead perform a simultaneous operation? Thanks!
Update Pandas 0.21 answer: pd.TimeGrouper is getting deprecated, use pd.Grouper instead.
mean_agg = (df.groupby(['JOB_TITLE',pd.Grouper(freq='5AS')])['BASE_SALARY']
.mean())
mean_agg.unstack('JOB_TITLE')
Instead of using resample, let's try to use pd.TimeGrouper
mean_agg = (df
.groupby(['JOB_TITLE',pd.TimeGrouper(freq='5AS')])['BASE_SALARY']
.mean())
mean_agg.unstack('JOB_TITLE')
TimeGrouper aligns the bins of the grouped time range.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With