I have a dataframe with sporadic dates as the index and two columns, 'id' and 'num'. I would like to group by
the 'id' column and apply a reindex to each group, so that every group gets one row per day.
My sample dataset looks like this:
            id  num
2015-08-01   1    3
2015-08-05   1    5
2015-08-06   1    4
2015-07-31   2    1
2015-08-03   2    2
2015-08-06   2    3
My expected output after reindexing each group with ffill is:
            id  num
2015-08-01   1    3
2015-08-02   1    3
2015-08-03   1    3
2015-08-04   1    3
2015-08-05   1    5
2015-08-06   1    4
2015-07-31   2    1
2015-08-01   2    1
2015-08-02   2    1
2015-08-03   2    2
2015-08-04   2    2
2015-08-05   2    2
2015-08-06   2    3
I have tried this, among other things, to no avail:
newdf = df.groupby('id').reindex(method='ffill')
This raises the error:
AttributeError: Cannot access callable attribute 'reindex' of 'DataFrameGroupBy' objects, try using the 'apply' method
Any help would be much appreciated.
To turn the group keys back into ordinary columns after groupby(), call reset_index() on the result.
How do you group by the index in pandas? Pass the name of the DataFrame's index to groupby() to group rows on that index. DataFrame.groupby() accepts a string or a list of strings naming the columns or index levels to group by.
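As a rough illustration of grouping on a named index, here is a minimal sketch. The frame, its values, and the variable names are made up for the example; only groupby(), its level argument, and reset_index() come from pandas itself.

```python
import pandas as pd

# Hypothetical frame whose index is named "id"
df = pd.DataFrame({"num": [3, 5, 1]}, index=pd.Index([1, 1, 2], name="id"))

# Group on the index by passing its name as the key...
by_name = df.groupby("id")["num"].sum()

# ...or by naming the index level explicitly
by_level = df.groupby(level="id")["num"].sum()

# reset_index() moves the group keys back into an ordinary column
flat = by_name.reset_index()
```

Both groupby calls should give the same per-id totals; the level= form is the less ambiguous one if a column and an index level could share a name.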
One can reindex the rows or the columns of a dataframe with the reindex() method by specifying which axis to reindex. By default, labels in the new index that are not present in the dataframe get NaN values.
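To make the NaN behaviour concrete, here is a small sketch on a made-up Series (the dates and values are illustrative only). It compares a plain reindex with one that forward-fills via the method argument.

```python
import pandas as pd

# Two observations with a one-day gap between them
s = pd.Series([3, 5], index=pd.to_datetime(["2015-08-01", "2015-08-03"]))

# The full daily range we want to end up with
full = pd.date_range("2015-08-01", "2015-08-03", freq="D")

# Plain reindex: the missing date 2015-08-02 is filled with NaN
plain = s.reindex(full)

# Reindex with forward fill: the missing date takes the previous value
filled = s.reindex(full, method="ffill")
```

Note that the plain reindex has to convert the integer values to float in order to hold NaN, while the method="ffill" variant never introduces NaN here and so keeps the integers.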
Groupby preserves the order of rows within each group.
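A quick way to see the ordering guarantee is to collect each group's values into a list. This is only a sketch with invented data, where the ids are deliberately interleaved.

```python
import pandas as pd

# Rows for ids 1 and 2 are interleaved on purpose
df = pd.DataFrame({"id": [2, 1, 2, 1], "num": [10, 20, 30, 40]})

# Collect each group's values; within a group they keep their original row order
per_group = df.groupby("id")["num"].apply(list)
```

The group keys themselves are sorted by default (sort=True), but the rows inside each group stay in the order they had in the original frame.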
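There's probably a slicker way to do this, but this works:
import pandas as pd
def reindex_by_date(df):
    # Daily range spanning this group's first and last dates
    dates = pd.date_range(df.index.min(), df.index.max())
    return df.reindex(dates).ffill()
# Apply per group, then drop the 'id' level that groupby adds to the index
df.groupby('id').apply(reindex_by_date).reset_index(0, drop=True)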
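For anyone who wants to try this end to end, here is a self-contained sketch built on the sample data from the question. It is a variation on the approach above, not the same code: it loops over the groups and uses pd.concat instead of groupby().apply(), and it passes method="ffill" to reindex() so that no NaN is ever introduced and the 'id' and 'num' columns stay integers. It assumes each group's dates are already sorted, which reindex() with a fill method requires. The names group and result are just for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {"id": [1, 1, 1, 2, 2, 2], "num": [3, 5, 4, 1, 2, 3]},
    index=pd.to_datetime(
        ["2015-08-01", "2015-08-05", "2015-08-06",
         "2015-07-31", "2015-08-03", "2015-08-06"]
    ),
)

def reindex_by_date(group):
    # Daily range from this group's first date to its last date
    dates = pd.date_range(group.index.min(), group.index.max(), freq="D")
    # Forward-fill while reindexing (needs a sorted index within the group)
    return group.reindex(dates, method="ffill")

# Reindex each id separately, then stack the pieces back together
result = pd.concat([reindex_by_date(group) for _, group in df.groupby("id")])
print(result)
```

Looping over the groups sidesteps the question of how groupby().apply() handles the grouping column, which may differ between pandas versions, at the cost of being a little more verbose.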