Pandas get frequency of item occurrences in a column as percentage [duplicate]

I want to get the percentage of each value in a df column. Say I have a df with columns (col1, col2, col3, gender), where the gender column has the values M, F, or Other. I want to get the percentage of M, F, and Other values in the df.

I have tried this, which gives me the number of M, F, and Other instances, but I want these as a percentage of the total number of values in the df.

df.groupby('gender').size() 

Can someone help?

SANM2009 asked May 28 '18 02:05


People also ask

How do I get the percentage of a column in pandas?

You can calculate a percentage of a total in pandas with groupby() and the transform() method. transform() applies a function to each group and returns a result with the same index as the original DataFrame, so each value can be divided by its group total. To get percentages of the overall total instead, divide by the sum over all the data.
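
A minimal sketch of the groupby() and transform() approach, assuming a small made-up DataFrame. The `amount` column and the names `group_total` and `pct_of_group` are illustrative and not part of the original question:

```python
import pandas as pd

# Hypothetical data: column names are assumptions for illustration only.
df = pd.DataFrame({
    "gender": ["M", "M", "F", "F", "Other"],
    "amount": [10, 30, 20, 20, 5],
})

# transform("sum") returns each group's total, aligned with the original rows.
group_total = df.groupby("gender")["amount"].transform("sum")

# Each row's share of its own group's total, as a percentage.
df["pct_of_group"] = df["amount"] / group_total * 100

print(df)
```

Because the result of transform() keeps the original index, the division lines up row by row without a merge.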

How do you count the frequency of a column in pandas?

To count the frequency of each value in a DataFrame column in pandas, we can use the df.groupby(column_name).size() method.
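
As a rough illustration with made-up data (the DataFrame contents and the name `counts` are assumptions), the per-value counts look like this:

```python
import pandas as pd

# Hypothetical sample data with a single 'gender' column.
df = pd.DataFrame({"gender": ["M", "F", "M", "Other"]})

# size() counts the rows in each group, giving one count per distinct value.
counts = df.groupby("gender").size()

print(counts)
```

Dividing `counts` by `len(df)` would turn these counts into fractions of all rows.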

How do you count the number of repeated values in pandas?

You can count the number of duplicate rows by counting the True values in the pandas.Series returned by duplicated(). The True values can be counted with the sum() method. To count the False values instead (the number of non-duplicate rows), invert the Series with the negation operator ~ and then count True with sum().
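
A short sketch of that counting, assuming a small made-up DataFrame (the data and variable names are illustrative):

```python
import pandas as pd

# Hypothetical data: the third row repeats the first.
df = pd.DataFrame({
    "gender": ["M", "F", "M", "M"],
    "age": [30, 25, 30, 41],
})

# duplicated() marks a row True when it repeats an earlier row.
dup_mask = df.duplicated()

n_duplicates = int(dup_mask.sum())        # rows that repeat an earlier row
n_non_duplicates = int((~dup_mask).sum()) # first occurrences and unique rows

print(n_duplicates, n_non_duplicates)
```

By default the first occurrence of a repeated row is not marked as a duplicate, so only later copies are counted.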

What does value_counts() do in pandas?

Return a Series containing counts of unique values. The resulting object will be in descending order so that the first element is the most frequently-occurring element.


1 Answer

Use value_counts with normalize=True:

df['gender'].value_counts(normalize=True) * 100 

With normalize=True, each value in the result is a fraction in the range (0, 1]. Multiplying by 100 converts these fractions to percentages.
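
A minimal runnable sketch of this, assuming a small made-up DataFrame (the data and the name `pct` are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample data; only the 'gender' column matters here.
df = pd.DataFrame({"gender": ["M", "F", "M", "Other", "F", "M", "M", "F"]})

# normalize=True returns each value's share of the non-null rows as a fraction;
# multiplying by 100 turns the fractions into percentages.
pct = df["gender"].value_counts(normalize=True) * 100

print(pct.round(1))
```

Note that value_counts() drops missing values by default, so the percentages are relative to the non-null rows. Passing dropna=False includes NaN as its own category.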

cs95 answered Sep 19 '22 23:09