
How to calculate count and percentage in groupby in Python

I have the following output after grouping by Category:

Publisher.groupby('Category')['Title'].count()
Category
Coding          5
Hacking         7
Java            1
JavaScript      5
LEGO           43
Linux           7
Networking      5
Others        123
Python          8
R               2
Ruby            4
Scripting       4 
Statistics      2
Web             3

In the above output I also want the percentage, i.e. 5*100/219 for the first row, and so on. I am doing the following:

 Publisher.groupby('Category')['Title'].agg({'Count':'count','Percentage':lambda x:x/x.sum()})

But it gives me an error. Please help.

Neil asked Oct 06 '16 10:10



1 Answer

I think you can use:

P = Publisher.groupby('Category')['Title'].count().reset_index()
P['Percentage'] = 100 * P['Title']  / P['Title'].sum()

Sample:

import pandas as pd

Publisher = pd.DataFrame({'Category':['a','a','s'],
                          'Title':[4,5,6]})

print (Publisher)
  Category  Title
0        a      4
1        a      5
2        s      6

P = Publisher.groupby('Category')['Title'].count().reset_index()
P['Percentage'] = 100 * P['Title']  / P['Title'].sum()
print (P)
  Category  Title  Percentage
0        a      2   66.666667
1        s      1   33.333333
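
A variation on the same idea, offered as a sketch rather than a definitive fix: the `agg` call in the question most likely fails because the lambda returns a whole Series per group instead of a single number, and recent pandas versions also no longer accept a renaming dict in `agg` on a grouped Series. Computing the counts once and deriving the percentage from them avoids both problems. The names `counts` and `summary` below are only illustrative, and the data is the same toy frame as in the sample.

```python
import pandas as pd

# Same toy data as in the sample above.
Publisher = pd.DataFrame({'Category': ['a', 'a', 's'],
                          'Title': [4, 5, 6]})

# Number of non-null titles per category (a Series indexed by Category).
counts = Publisher.groupby('Category')['Title'].count()

# Put the count and its share of the grand total side by side.
summary = pd.DataFrame({'Count': counts,
                        'Percentage': 100 * counts / counts.sum()}).reset_index()
print(summary)
```

This gives the same numbers as the answer above, with the count column named `Count` instead of `Title`.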
jezrael answered Oct 11 '22 23:10