How to calculate percentage with Pandas' DataFrame

Tags:

python

pandas

How can I add another column with the percentage to a pandas DataFrame? The dict can change in size.

>>> import pandas as pd
>>> a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}
>>> p = pd.DataFrame(a.items())
>>> p
        0  1
0  Test 2  1
1  Test 3  1
2  Test 1  4
3  Test 4  9

[4 rows x 2 columns]
user977828 asked May 08 '14 10:05


2 Answers

If you really want each score as a fraction of 10, the simplest way is to name the columns when you build the DataFrame:

>>> p = pd.DataFrame(a.items(), columns=['item', 'score'])
>>> p['perc'] = p['score']/10
>>> p
     item  score  perc
0  Test 2      1   0.1
1  Test 3      1   0.1
2  Test 1      4   0.4
3  Test 4      9   0.9

For each score's share of the total, divide by the column sum instead (multiply by 100 if you want values on a 0-100 scale):

>>> p['perc'] = p['score'] / p['score'].sum()
>>> p
     item  score      perc
0  Test 2      1  0.066667
1  Test 3      1  0.066667
2  Test 1      4  0.266667
3  Test 4      9  0.600000
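
The `perc` column above holds fractions between 0 and 1. As a minimal sketch, assuming you want values on a 0-100 scale plus a display string, something like the following should work. The column names `pct` and `pct_label` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}

# list() makes the input an explicit list of (key, value) pairs
p = pd.DataFrame(list(a.items()), columns=['item', 'score'])

# share of the total, scaled to 0-100 and rounded to two decimals
p['pct'] = (p['score'] / p['score'].sum() * 100).round(2)

# illustrative string column (hypothetical name), e.g. for printing
p['pct_label'] = p['pct'].map('{:.2f}%'.format)

print(p)
```

Because the divisor is the column sum, this keeps working when the dict grows or shrinks.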
FooBar answered Sep 29 '22 11:09


First, make the keys of your dictionary the index of your DataFrame:

 import pandas as pd
 a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}
 p = pd.DataFrame([a])
 p = p.T  # transpose so the dict keys become the row index
 p.columns = ['score']
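
As an alternative sketch for this first step, `pd.DataFrame.from_dict` with `orient='index'` should build the same one-column frame directly, without the transpose. The variable name `p_alt` is only for illustration.

```python
import pandas as pd

a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}

# orient='index' turns each dict key into a row label;
# columns= names the single value column
p_alt = pd.DataFrame.from_dict(a, orient='index', columns=['score'])

print(p_alt)
```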

Then, compute the percentage and assign to a new column.

 def compute_percentage(row):
     # row is a single DataFrame row; take its score as a share of the column total
     pct = row['score'] / p['score'].sum() * 100
     return round(pct, 2)

 p['percentage'] = p.apply(compute_percentage, axis=1)

This gives you:

         score  percentage
 Test 1      4   26.67
 Test 2      1    6.67
 Test 3      1    6.67
 Test 4      9   60.00

 [4 rows x 2 columns]
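
The row-by-row `apply` is not strictly needed here. A minimal vectorized sketch, assuming the same `score` column, divides the whole column at once. It rebuilds the frame from a `pd.Series` so that the block runs on its own.

```python
import pandas as pd

a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}

# a Series built from a dict uses the keys as its index
p = pd.Series(a, name='score').to_frame()

# divide every score by the column total in one step
p['percentage'] = (p['score'] / p['score'].sum() * 100).round(2)

print(p)
```

On large frames this usually runs faster than calling a Python function per row, and it gives the same rounded values as the table above.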
joemar.ct answered Sep 29 '22 13:09