pandas convert columns to percentages of the totals

Tags: python, pandas

I have a dataframe with four columns: an ID and three categories that results fell into:

  <80% 80-90 >90
id
1   2     4    4
2   3     6    1
3   7     0    3

I would like to convert it to percentages of each row's total, i.e.:

   <80% 80-90 >90
id
1   20%   40%  40%
2   30%   60%  10%
3   70%    0%  30%

This seems like it should be within pandas' capabilities, but I just can't figure it out.

Thanks in advance!

DTATSO asked Feb 02 '17 15:02


1 Answer

You can do this with the basic pandas methods .div and .sum, using the axis argument to make sure the calculations happen the way you want:

cols = ['<80%', '80-90', '>90']
df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0).multiply(100)
  • Calculate the sum of each row: df[cols].sum(axis=1). axis=1 makes the summation run across the columns within each row, rather than down each column, so you get one total per id.
  • Divide the dataframe by the resulting series: df[cols].div(df[cols].sum(axis=1), axis=0). axis=0 aligns the series with the dataframe's index, so each row is divided by its own total.
  • To finish, multiply the results by 100 so they are percentages between 0 and 100 instead of proportions between 0 and 1 (or you can skip this step and store them as proportions).
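
The steps above can be sketched end to end as follows. This is a minimal example that rebuilds the dataframe from the question. The names row_totals, pct and pct_labels are introduced here for illustration only, and the final string-formatting step is an optional extra (not part of the answer's code) for when you want the literal "20%" display shown in the question.

```python
import pandas as pd

# Rebuild the sample data from the question, with "id" as the index.
df = pd.DataFrame(
    {"<80%": [2, 3, 7], "80-90": [4, 6, 0], ">90": [4, 1, 3]},
    index=pd.Index([1, 2, 3], name="id"),
)

cols = ["<80%", "80-90", ">90"]

# Step 1: one total per row (axis=1 sums across the columns of each row).
row_totals = df[cols].sum(axis=1)

# Steps 2 and 3: divide each row by its own total (axis=0 aligns on the
# index), then scale from proportions (0-1) to percentages (0-100).
pct = df[cols].div(row_totals, axis=0).multiply(100)

# Optional (illustrative): turn the numbers into display strings like "20%".
# This rounds to whole percentages and makes the values non-numeric.
pct_labels = pct.round(0).astype(int).astype(str) + "%"

print(pct_labels)
```

Keeping the numeric pct frame separate from the formatted pct_labels is a deliberate choice here: once the values are strings, they can no longer be summed or compared numerically.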
ASGM answered Oct 05 '22 10:10