Add column for percentage of total to Pandas dataframe

I have a dataframe that I run groupby() on to count the occurrences of each value in a column. I am trying to add an additional column for "Percentage of Total", but I'm not sure how to accomplish that.

I've looked at a few groupby options, but can't seem to find anything that fits.

My dataframe looks like this:

              DAYSLATE
DAYSLATE          
-7 days          1
-5 days          2
-3 days          8
-2 days          9
-1 days         45
0 days         589
1 days          33
2 days           8
3 days          16
4 days          14
5 days          16
6 days           2
7 days           6
8 days           2
9 days           2
10 days          1

AlliDeacon asked Jun 26 '17 18:06


1 Answer

Option 1

df['DAYSLATE_pct'] = df.DAYSLATE / df.DAYSLATE.sum()
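
To illustrate Option 1, here is a minimal sketch that rebuilds a few rows of the grouped counts from the question and divides each count by the column total. The four-row subset, the string index labels, and the extra `DAYSLATE_pct_100` column are assumptions made for the example, not part of the original answer.

```python
import pandas as pd

# Assumed sample: a four-row subset of the grouped counts shown in the question.
df = pd.DataFrame(
    {"DAYSLATE": [45, 589, 33, 8]},
    index=pd.Index(["-1 days", "0 days", "1 days", "2 days"], name="DAYSLATE"),
)

# Fraction of the total for each row (values sum to 1.0).
df["DAYSLATE_pct"] = df["DAYSLATE"] / df["DAYSLATE"].sum()

# Hypothetical extra column: the same share on a 0-100 scale.
df["DAYSLATE_pct_100"] = df["DAYSLATE_pct"] * 100

print(df)
```

The division yields a fraction between 0 and 1; multiply by 100 only if the column should read as a percentage.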

Option 2
Use Series.value_counts instead of groupby

pre_df.DAYSLATE.value_counts(normalize=True)
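
As a rough sketch of Option 2, the example below assumes `pre_df` is the ungrouped dataframe with one row per record, which the answer does not state explicitly. The sample values and the `summary` frame that puts counts and shares side by side are hypothetical.

```python
import pandas as pd

# Assumed raw data: one row per record, before any groupby.
pre_df = pd.DataFrame({"DAYSLATE": pd.to_timedelta([0, 0, 0, 1, -1, 0], unit="D")})

# Raw counts and normalized shares, both sorted by the DAYSLATE value.
counts = pre_df["DAYSLATE"].value_counts().sort_index()
shares = pre_df["DAYSLATE"].value_counts(normalize=True).sort_index()

# Hypothetical combined view: count and fraction of total per value.
summary = pd.DataFrame({"count": counts, "pct": shares})

print(summary)
```

With `normalize=True`, `value_counts` returns each value's share of the non-null rows, so the separate division step from Option 1 is not needed.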

piRSquared answered Sep 17 '22 14:09