After a query, my DataFrame is empty. When I then call groupby on it, pandas raises a RuntimeWarning and I get back another empty DataFrame, this time with no columns. How can I keep the columns?
import pandas as pd
df = pd.DataFrame(columns=["PlatformCategory","Platform","ResClassName","Amount"])
print df
result:
Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []
then groupby:
df = df.groupby(["PlatformCategory","Platform","ResClassName"]).sum()
df = df.reset_index(drop=False,inplace=True)
print df
result: sometimes None, sometimes an empty DataFrame:
Empty DataFrame
Columns: []
Index: []
Why does the empty DataFrame have no columns?
RuntimeWarning:
/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: divide by zero encountered in log
if alpha + beta * ngroups < count * np.log(count):
/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: invalid value encountered in double_scalars
if alpha + beta * ngroups < count * np.log(count):
You need as_index=False and group_keys=False:
df = df.groupby(["PlatformCategory","Platform","ResClassName"], as_index=False).count()
df
Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []
No need to reset your index afterwards.
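As for the None result in the question: reset_index(..., inplace=True) modifies the frame in place and returns None, so assigning its return value back to df replaces the frame with None. The sketch below shows this on a few made-up rows (the sample values are assumptions for illustration only, not data from the question).

```python
import pandas as pd

# Made-up sample rows, for illustration only.
df = pd.DataFrame({
    "Platform": ["web", "web", "app"],
    "Amount": [1, 2, 3],
})

grouped = df.groupby("Platform").sum()

# With inplace=True, reset_index changes `grouped` itself and returns None,
# so writing `grouped = grouped.reset_index(inplace=True)` would lose the frame.
returned = grouped.reset_index(inplace=True)

# Without inplace, reset_index returns a new frame that is safe to assign.
flat = df.groupby("Platform").sum().reset_index()
```

Use either the in-place call without assignment or the plain call with assignment, not both at once.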
Some code that works the same for .sum() whether or not the dataframe is empty:
def groupby_sum(df, groupby_cols):
    groupby = df.groupby(groupby_cols, as_index=False)
    summed = groupby.sum()
    # Fall back to count() when sum() comes back empty, since count() keeps the columns.
    return (groupby.count() if summed.empty else summed).set_index(groupby_cols)
df = groupby_sum(df, ["PlatformCategory", "Platform", "ResClassName"])
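If the installed pandas version still drops columns in the empty case, another option is to skip the groupby entirely when there are no rows. This is a different technique from the function above, shown only as a minimal sketch: groupby_sum_or_empty is a hypothetical helper name (not a pandas function), and the sample rows are made up.

```python
import pandas as pd

def groupby_sum_or_empty(df, groupby_cols):
    # Hypothetical helper, not part of pandas.
    if df.empty:
        # Nothing to aggregate: return a copy so the original columns survive.
        return df.copy()
    return df.groupby(groupby_cols, as_index=False).sum()

cols = ["PlatformCategory", "Platform", "ResClassName", "Amount"]
keys = cols[:3]

# Empty input: the columns are preserved.
empty_result = groupby_sum_or_empty(pd.DataFrame(columns=cols), keys)

# Made-up rows, for illustration only.
sample = pd.DataFrame(
    [["c1", "p1", "r1", 1], ["c1", "p1", "r1", 2], ["c2", "p2", "r2", 5]],
    columns=cols,
)
sample_result = groupby_sum_or_empty(sample, keys)
```

Unlike groupby_sum above, this sketch leaves the group keys as ordinary columns instead of moving them into the index.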