
dask: how to groupby, aggregate without losing column used for groupby

How does one get SQL-style grouped output when grouping the following data:

   item   frequency
    A      5
    A      9
    B      2
    B      4
    C      6

df.groupby(by = ["item"]).sum()

results in this:

  item   frequency
    A      14
    B      6
    C      6

In pandas this is achieved by setting as_index=False, but dask doesn't support this argument in groupby. It currently omits the item column from the result and returns only the frequency column.

Omley asked Feb 11 '18 14:02



1 Answer

Perhaps call .reset_index afterwards?

MRocklin answered Sep 30 '22 13:09
