
Calculating Percentile in Python Pandas Dataframe [duplicate]

I'm trying to calculate the percentile of each number within a dataframe and add it to a new column called 'percentile'.

This is my attempt:

import pandas as pd
from scipy import stats

data = {'symbol':'FB','date':['2012-05-18','2012-05-21','2012-05-22','2012-05-23'],'close':[38.23,34.03,31.00,32.00]}

df = pd.DataFrame(data)

close = df['close']

for i in df:
    df['percentile'] = stats.percentileofscore(close,df['close'])

The column is not being filled and results in 'NaN'. This should be fairly easy, but I'm not sure where I'm going wrong.

Thanks in advance for the help.

mattblack asked Jun 18 '17 03:06


People also ask

How do you find the 95th percentile in pandas?

The pandas quantile() function takes the percentile as a fraction between 0 and 1. For example, pass 0.95 to get the 95th percentile value.
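
As a minimal sketch of that call, the snippet below reuses the closing prices from the question. Note that quantile() goes from a percentile to a value, which is the inverse of ranking each value as the question asks. The variable name p95 is only illustrative.

```python
import pandas as pd

# Closing prices taken from the question's sample data.
close = pd.Series([38.23, 34.03, 31.00, 32.00], name='close')

# quantile() expects a fraction in [0, 1], so the 95th percentile is q=0.95.
# By default pandas interpolates linearly between the two nearest data points.
p95 = close.quantile(0.95)

print(p95)
```

Because of the linear interpolation, the result falls between the two largest prices rather than being one of the observed values.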


1 Answer

df.close.apply(lambda x: stats.percentileofscore(df.close.sort_values(),x))

or

df.close.rank(pct=True)

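Output of the rank(pct=True) version (the percentileofscore version gives the same values on a 0-100 scale):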

0    1.00
1    0.75
2    0.25
3    0.50
Name: close, dtype: float64
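
To tie this back to the question, here is a hedged, self-contained sketch that writes the result into a 'percentile' column without any loop. It assumes the same sample data as the question. The second column name, 'percentile_scipy', is made up for this illustration, and the sort_values() call is left out because percentileofscore does not need sorted input.

```python
import pandas as pd
from scipy import stats

# Same sample data as in the question.
df = pd.DataFrame({
    'symbol': 'FB',
    'date': ['2012-05-18', '2012-05-21', '2012-05-22', '2012-05-23'],
    'close': [38.23, 34.03, 31.00, 32.00],
})

# Percentile rank of each close as a fraction; vectorized, so no loop is needed.
df['percentile'] = df['close'].rank(pct=True)

# SciPy alternative on a 0-100 scale, scoring one value at a time.
# 'percentile_scipy' is a hypothetical column name used only in this sketch.
df['percentile_scipy'] = df['close'].apply(
    lambda x: stats.percentileofscore(df['close'], x)
)

print(df[['close', 'percentile', 'percentile_scipy']])
```

The rank approach stays inside pandas and is usually the simpler choice; the SciPy version is worth considering when a different `kind` of percentile definition is needed.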

Scott Boston answered Sep 19 '22 10:09