I have a DF as follows:
Date Bought | Fruit
2018-01     | Apple
2018-02     | Orange
2018-02     | Orange
2018-02     | Lemon
I wish to group the data by 'Date Bought' & 'Fruit' and count the occurrences.
Expected result:
Date Bought | Fruit  | Count
2018-01     | Apple  | 1
2018-02     | Orange | 2
2018-02     | Lemon  | 1
What I get:
Date Bought | Fruit  | Count
2018-01     | Apple  | 1
2018-02     | Orange | 2
            | Lemon  | 1
Code used:
Initial attempt:
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count')
#2
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index()
ERROR: Cannot insert Fruit, already exists
#3
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index(inplace=True)
ERROR: TypeError: Cannot reset_index inplace on a Series to create a DataFrame
Documentation shows that the groupby function returns a 'groupby object' not a standard DF. How can I group the data as mentioned above and retain the DF format?
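The problem here is that by resetting the index you'd end up with 2 columns with the same name: the counted Series is named 'Fruit', and so is one of the index levels. Because you are working with a Series, you can set the name parameter in Series.reset_index to give the count column a different name: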
df1 = (df.groupby(['Date Bought','Fruit'], sort=False)['Fruit']
         .agg('count')
         .reset_index(name='Count'))
print(df1)
  Date Bought   Fruit  Count
0     2018-01   Apple      1
1     2018-02  Orange      2
2     2018-02   Lemon      1
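For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt from the data in the question, and the variable names counts and df2 are just illustrative. It also shows GroupBy.size() as one possible alternative to selecting 'Fruit' and calling count; size() counts rows per group (including NaN), so the two only agree when the counted column has no missing values, as is the case here.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Date Bought": ["2018-01", "2018-02", "2018-02", "2018-02"],
    "Fruit": ["Apple", "Orange", "Orange", "Lemon"],
})

# The grouped count is a Series named 'Fruit' whose MultiIndex also has a
# level named 'Fruit'; that name clash is why a plain reset_index() fails.
counts = df.groupby(["Date Bought", "Fruit"], sort=False)["Fruit"].agg("count")
print(counts.name)
print(list(counts.index.names))

# Fix from the answer: rename the values column while resetting the index
df1 = counts.reset_index(name="Count")
print(df1)

# Alternative sketch: size() returns an unnamed Series, so there is no clash
# to begin with, and name= simply labels the new column
df2 = (df.groupby(["Date Bought", "Fruit"], sort=False)
         .size()
         .reset_index(name="Count"))
print(df2)
```

sort=False keeps the groups in order of first appearance, which is why Orange comes before Lemon in both results.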