I am trying to bin a Pandas DataFrame into three-day windows. I have two columns, A and B, which I want to sum in each window. The code I wrote for this is:
df = df.groupby(df.index // 3).agg({'A': 'sum', 'B': 'sum'})
It converts NaN values to zero when doing the sum, but I would like them to remain NaN, because my data also contains genuine (non-NaN) zero values.
For example, if I had this df:
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [np.nan, np.nan],
    [np.nan, 3],
    [np.nan, np.nan],
    [2, 0],
    [4, 0],
    [0, 0]
], columns=['A', 'B'])
Index A B
0 NaN NaN
1 NaN 3
2 NaN NaN
3 2 0
4 4 0
5 0 0
I would like the new df to be:
Index A B
0 NaN 3
1 6 0
But my current code outputs:
Index A B
0 0 3
1 6 0
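df.groupby(df.index // 3)[['A', 'B']].mean()
The above snippet keeps all-NaN windows as NaN, but note that it returns the mean of each window rather than the sum (for the sample data, A in the second window comes out as 2, not 6).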
If you want the sum, use df.groupby(df.index // 3)[['A', 'B']].sum(min_count=1). With min_count=1, a window that contains no non-NaN values sums to NaN instead of 0.
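As a minimal sketch of the min_count approach, the following rebuilds the sample data from the question (with B = 3 in the second row, as in the printed table) and sums it in three-row windows. The helper name windowed_sum and its size parameter are only illustrative and are not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample data from the question (B is 3 in row 1, matching the printed table).
df = pd.DataFrame(
    [[np.nan, np.nan], [np.nan, 3], [np.nan, np.nan], [2, 0], [4, 0], [0, 0]],
    columns=["A", "B"],
)


def windowed_sum(frame, size=3):
    """Sum A and B over consecutive windows of `size` rows (hypothetical helper).

    min_count=1 makes a window with no non-NaN values sum to NaN
    instead of 0, while genuine zeros still sum to 0.
    """
    return frame.groupby(frame.index // size)[["A", "B"]].sum(min_count=1)


result = windowed_sum(df)
print(result)
```

The first window of A stays NaN because all three of its values are missing, while the second window of B is a real 0 because it sums actual zeros.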
Another option:
df.groupby(df.index // 3).agg({'A': lambda x: x.sum(skipna=False),
                               'B': lambda x: x.sum(skipna=True)})
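One thing to keep in mind with skipna=False is that it is stricter than min_count=1: a single NaN anywhere in the window makes the whole sum NaN. The sketch below compares the two on a partially missing window; the variable names are just for illustration.

```python
import numpy as np
import pandas as pd

# A window with one real value and two missing ones.
partial = pd.Series([np.nan, 3, np.nan])
# A window with no real values at all.
all_missing = pd.Series([np.nan, np.nan, np.nan])

strict = partial.sum(skipna=False)     # any NaN propagates to the result
lenient = partial.sum(min_count=1)     # NaNs are skipped, one valid value is enough
empty = all_missing.sum(min_count=1)   # no valid values, so the result is NaN

print(strict, lenient, empty)
```

So skipna=False fits if a window should only count when it is fully populated, while min_count=1 fits if any available value should contribute to the sum.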