
.agg Sum Converting NaN to 0

Tags:

python

pandas

I am trying to bin a pandas DataFrame into three-day windows. I have two columns, A and B, which I want to sum in each window. This is the code I wrote for the task:

    df = df.groupby(df.index // 3).agg({'A': 'sum', 'B': 'sum'})

It converts NaN values to zero when summing, but I would like an all-NaN window to remain NaN, because my data also contains genuine zero values.

For example if I had this df:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame([
        [np.nan, np.nan],
        [np.nan, 3],
        [np.nan, np.nan],
        [2, 0],
        [4, 0],
        [0, 0]
    ], columns=['A', 'B'])

Index A   B
0     NaN NaN
1     NaN 3
2     NaN NaN
3     2   0
4     4   0
5     0   0

I would like the new df to be:

Index A   B
0     NaN 3
1     6   0

But my current code outputs:

Index A   B
0     0   3
1     6   0
taurus asked Nov 07 '22 14:11


1 Answer

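    df.groupby(df.index // 3)[['A', 'B']].mean()

The above snippet keeps an all-NaN window as NaN, but it returns means rather than sums, so it matches the sample output only where the two coincide (A in the second window comes out as 2, not 6).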

If you want the sum, use df.groupby(df.index // 3)[['A', 'B']].sum(min_count=1). With min_count=1, a window needs at least one non-NaN value to produce a number; otherwise the result is NaN.
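As a minimal sketch of the min_count approach, assuming the sample frame from the question (with B equal to 3 in row 1, as in the printed table), the default sum and the min_count=1 sum can be compared side by side. The variable names default_sum and nan_aware_sum are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question (B is 3 in row 1, matching the printed table).
df = pd.DataFrame(
    [[np.nan, np.nan], [np.nan, 3], [np.nan, np.nan], [2, 0], [4, 0], [0, 0]],
    columns=["A", "B"],
)

# Rows 0-2 fall into window 0, rows 3-5 into window 1.
windows = df.index // 3

# Default behaviour: NaN values are skipped, so an all-NaN window sums to 0.
default_sum = df.groupby(windows)[["A", "B"]].sum()

# min_count=1: a window with no non-NaN values yields NaN instead of 0.
nan_aware_sum = df.groupby(windows)[["A", "B"]].sum(min_count=1)

print(default_sum)
print(nan_aware_sum)
```

Here column A in window 0 has no valid values, so it stays NaN with min_count=1, while the genuine zeros in window 1 are still summed normally.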

Another option:

    df.groupby(df.index // 3).agg({'A': lambda x: x.sum(skipna=False),
                                   'B': lambda x: x.sum(skipna=True)})
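One thing worth checking before picking between these options is how they treat a window that mixes NaN with real values. The sketch below, again assuming the question's sample frame, applies each rule to both columns; strict_sum and lenient_sum are hypothetical helper names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[np.nan, np.nan], [np.nan, 3], [np.nan, np.nan], [2, 0], [4, 0], [0, 0]],
    columns=["A", "B"],
)
windows = df.index // 3


def strict_sum(s):
    # skipna=False: a single NaN anywhere in the window makes the sum NaN.
    return s.sum(skipna=False)


def lenient_sum(s):
    # min_count=1: NaN only when the window has no valid values at all.
    return s.sum(min_count=1)


strict = df.groupby(windows).agg({"A": strict_sum, "B": strict_sum})
lenient = df.groupby(windows).agg({"A": lenient_sum, "B": lenient_sum})

print(strict)
print(lenient)
```

Column B in window 0 holds NaN, 3, NaN: the skipna=False rule turns that into NaN, whereas min_count=1 keeps the 3. That is why the snippet above uses skipna=False only for A and leaves B with the default skipping behaviour.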
Karan Arya answered Nov 14 '22 21:11