I want to get the percentage of each value in a df column. Say I have a df with columns (col1, col2, col3, gender), where the gender column has values of M, F, or Other. I want to get the percentage of M, F, and Other values in the df.
I have tried this, which gives me the number of M, F, and Other instances, but I want these as a percentage of the total number of rows in the df.
df.groupby('gender').size()
Can someone help?
You can calculate a percentage of the total in pandas with groupby() and the transform() method. After a groupby(), transform() applies a function to each group and returns a result with the same index as the original data, so it can be combined with the original columns. If the percentage is instead computed directly on the summarized DataFrame, the results are calculated using all the data.
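As a minimal sketch of the groupby() and transform() approach described above: the sample data and the column names amount, pct_of_group and pct_of_total are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data; "amount" is an assumed numeric column.
df = pd.DataFrame({
    "gender": ["M", "F", "M", "Other", "F", "M"],
    "amount": [10, 20, 30, 40, 60, 40],
})

# transform("sum") returns the group total repeated for every row of the
# group, so it lines up with the original rows and can be divided directly.
group_total = df.groupby("gender")["amount"].transform("sum")
df["pct_of_group"] = df["amount"] / group_total * 100

# Share of each row in the grand total of the column.
df["pct_of_total"] = df["amount"] / df["amount"].sum() * 100

print(df)
```

Here pct_of_group sums to 100 within each gender, while pct_of_total sums to 100 over the whole column.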
To count the frequency of a value in a DataFrame column in Pandas, we can use the df.groupby(column_name).size() method.
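Building on the groupby().size() counts, one way to turn them into percentages is to divide by the number of rows. This is only a sketch on made-up data, and it assumes the gender column has no missing values (groupby drops NaN keys by default, while len(df) still counts those rows).

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({"gender": ["M", "F", "M", "Other", "F", "M", "M", "F"]})

# Number of rows per gender value.
counts = df.groupby("gender").size()

# Divide by the total number of rows to get percentages.
percentages = counts / len(df) * 100

print(percentages)
```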
You can count the number of duplicate rows by counting True in the pandas.Series obtained with duplicated(). The number of True values can be counted with the sum() method. If you want to count the number of False values (= the number of non-duplicate rows), you can invert the Series with negation ~ and then count True with sum().
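The duplicate-counting steps above could look roughly like this. The data is invented for illustration; note that duplicated() with its default settings marks every repeat of an earlier row as True and leaves the first occurrence as False.

```python
import pandas as pd

# Hypothetical sample data with some repeated rows.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3, 3, 3],
    "gender": ["M", "M", "F", "Other", "Other", "Other"],
})

# Boolean Series: True where a row repeats an earlier row.
dup_mask = df.duplicated()

# True counts as 1 when summed, so this is the number of duplicate rows.
n_duplicates = dup_mask.sum()

# Invert the mask with ~ to count the non-duplicate rows instead.
n_non_duplicates = (~dup_mask).sum()

print(n_duplicates, n_non_duplicates)
```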
value_counts() returns a Series containing counts of unique values. The resulting object is in descending order, so the first element is the most frequently occurring value.
Use value_counts with normalize=True:
df['gender'].value_counts(normalize=True) * 100
With normalize=True, each value in the result is a fraction in the range (0, 1]. We multiply by 100 here to get the percentage.
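For a quick check, here is a small sketch of that answer on made-up data. The summary table with count and percent columns is an optional extra added for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({"gender": ["M", "F", "M", "Other", "F", "M", "M", "F"]})

# Fractions per value, scaled to percentages.
pct = df["gender"].value_counts(normalize=True) * 100

# Optional: show raw counts and rounded percentages side by side.
summary = pd.DataFrame({
    "count": df["gender"].value_counts(),
    "percent": pct.round(1),
})

print(summary)
```

By default value_counts() ignores NaN, so the percentages are relative to the non-missing values; passing dropna=False would include missing values as their own category.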