I have a pandas dataframe with dates and strings similar to this:
Start       End         Note  Item
2016-10-22  2016-11-05  Z     A
2017-02-11  2017-02-25  W     B
I need to expand/transform it into the frame below, filling in the weeks (W-SAT) between the Start and End columns and forward-filling the data in Note and Item:
Start       Note  Item
2016-10-22  Z     A
2016-10-29  Z     A
2016-11-05  Z     A
2017-02-11  W     B
2017-02-18  W     B
2017-02-25  W     B
What's the best way to do this with pandas? Some sort of multi-index apply?
Pandas DataFrame expanding() function: expanding() provides expanding-window transformations. Its min_periods parameter is the minimum number of observations in the window required to have a value (otherwise the result is NA). In older pandas versions, a center option set the labels at the center of the window.
The append() method appends a DataFrame-like object to the end of the current DataFrame and returns a new DataFrame; the original is not changed. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0; use pd.concat() instead.
In pandas you can add a new column to an existing DataFrame with the DataFrame.insert() method, which modifies the DataFrame in place. DataFrame.assign() also adds a new column, but it returns a new DataFrame and leaves the original untouched.
You can iterate over the rows, build a small DataFrame for each one, and then concatenate them:
# Assumes `import pandas as pd` and that Start/End are datetime-like.
pd.concat([pd.DataFrame({'Start': pd.date_range(row.Start, row.End, freq='W-SAT'),
                         'Note': row.Note,
                         'Item': row.Item}, columns=['Start', 'Note', 'Item'])
           for i, row in df.iterrows()], ignore_index=True)
       Start Note Item
0 2016-10-22    Z    A
1 2016-10-29    Z    A
2 2016-11-05    Z    A
3 2017-02-11    W    B
4 2017-02-18    W    B
5 2017-02-25    W    B
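To try the row-wise approach end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question, and the helper name expand_weekly is only illustrative; it is not part of pandas. It assumes Start and End are already parsed as datetimes.

```python
import pandas as pd

# Sample data rebuilt from the question; dates are parsed up front.
df = pd.DataFrame({
    'Start': pd.to_datetime(['2016-10-22', '2017-02-11']),
    'End': pd.to_datetime(['2016-11-05', '2017-02-25']),
    'Note': ['Z', 'W'],
    'Item': ['A', 'B'],
})


def expand_weekly(frame, freq='W-SAT'):
    """Hypothetical helper: one output row per `freq` date in [Start, End]."""
    pieces = [
        pd.DataFrame({
            # date_range produces the weekly dates; the scalars are broadcast.
            'Start': pd.date_range(row.Start, row.End, freq=freq),
            'Note': row.Note,
            'Item': row.Item,
        })
        for row in frame.itertuples(index=False)
    ]
    return pd.concat(pieces, ignore_index=True)


expanded = expand_weekly(df)
print(expanded)
```

itertuples is used here instead of iterrows because it is usually faster and keeps each column's dtype, but either works for a frame this small.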
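You don't need explicit iteration at all. Melt Start and End into a single date column, then resample each group weekly and forward-fill. Two things to watch: resample('W') anchors weeks on Sunday, so W-SAT has to be given explicitly, and pad() is a deprecated alias of ffill() that is no longer available in recent pandas releases.
df_start_end = df.melt(id_vars=['Note', 'Item'], value_name='date')
# Assumes each Note value identifies exactly one original row.
result = (df_start_end.groupby('Note', sort=False)[['date', 'Item']]
          .apply(lambda x: x.set_index('date').resample('W-SAT').ffill())
          .reset_index()
          .rename(columns={'date': 'Start'}))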
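For anyone who wants to run the melt-and-resample idea as a whole, here is a minimal self-contained sketch. It is one possible reading of the approach above, not a definitive version: it assumes each Note value identifies a single original row, that Start and End already fall on Saturdays, and it uses 'W-SAT' and ffill() so that it runs on current pandas. The variable names long_df and expanded are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; dates are parsed up front.
df = pd.DataFrame({
    'Start': pd.to_datetime(['2016-10-22', '2017-02-11']),
    'End': pd.to_datetime(['2016-11-05', '2017-02-25']),
    'Note': ['Z', 'W'],
    'Item': ['A', 'B'],
})

# Stack Start and End into one 'date' column: two rows per original row.
long_df = df.melt(id_vars=['Note', 'Item'],
                  value_vars=['Start', 'End'],
                  value_name='date')

expanded = (
    long_df.groupby('Note', sort=False)[['date', 'Item']]
    # Within each group, index by date, create the weekly (Saturday) grid
    # between the two endpoints, and forward-fill Item into the new rows.
    .apply(lambda g: g.set_index('date').resample('W-SAT').ffill())
    .reset_index()
    .rename(columns={'date': 'Start'})
    [['Start', 'Note', 'Item']]
)
print(expanded)
```

Selecting [['date', 'Item']] before apply keeps the grouping column out of the frame passed to the lambda, which avoids depending on how a given pandas version treats grouping columns inside apply. If Start or End is not a Saturday, resample will snap it to the Saturday that closes its week, so the row-wise date_range approach may be the better fit in that case.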