I have a dataframe which looks like:
name date value
0 a 2020-01-01 1
1 a 2020-01-03 1
2 a 2020-01-05 1
3 b 2020-01-02 1
4 b 2020-01-03 1
5 b 2020-01-04 1
6 b 2020-01-05 1
where the values were calculate by value_df = df.groupby(['name', 'date'], as_index=False).value.sum()
how can I make it to following:
name date value
0 a 2020-01-01 1
1 a 2020-01-02 1
2 a 2020-01-03 1
3 a 2020-01-04 1
4 a 2020-01-05 1
5 b 2020-01-01 1
6 b 2020-01-02 1
7 b 2020-01-03 1
8 b 2020-01-04 1
9 b 2020-01-05 1
I tried
date_index = pd.date_range(start=min(df['date']), end=max(df['date']))
value_df['value'] = pd.Series(value_df['value'])
value_df.reindex(date_index)
Which has made no difference.
Try pivoting and then stack:
date_index = pd.date_range(start=df['date'].min(), end=df['date'].max())
(df.pivot_table('value','name','date',fill_value=1)
.reindex(date_index,axis=1).reset_index().melt('name',var_name='date'))
Or:
(df.pivot_table('value','name','date',fill_value=1)
.reindex(date_index,axis=1).stack().reset_index(name='value'))
name date value
0 a 2020-01-01 1
1 a 2020-01-02 1
2 a 2020-01-03 1
3 a 2020-01-04 1
4 a 2020-01-05 1
5 b 2020-01-01 1
6 b 2020-01-02 1
7 b 2020-01-03 1
8 b 2020-01-04 1
9 b 2020-01-05 1
We can do pivot
then stack
s=df.pivot(*df.columns).ffill().bfill().stack().to_frame('value').reset_index()
Out[199]:
name date value
0 a 2020-01-01 1.0
1 a 2020-01-02 1.0
2 a 2020-01-03 1.0
3 a 2020-01-04 1.0
4 a 2020-01-05 1.0
5 b 2020-01-01 1.0
6 b 2020-01-02 1.0
7 b 2020-01-03 1.0
8 b 2020-01-04 1.0
9 b 2020-01-05 1.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With