I'm trying to calculate the percentile of each number within a dataframe and add it to a new column called 'percentile'.
This is my attempt:
import pandas as pd
from scipy import stats
data = {'symbol':'FB','date':['2012-05-18','2012-05-21','2012-05-22','2012-05-23'],'close':[38.23,34.03,31.00,32.00]}
df = pd.DataFrame(data)
close = df['close']
for i in df:
    df['percentile'] = stats.percentileofscore(close,df['close'])
The column is not being filled and results in 'NaN'. This should be fairly easy, but I'm not sure where I'm going wrong.
Thanks in advance for the help.
Note that the pandas quantile() function expects the percentile as a fraction between 0 and 1, not as a percentage. For example, pass 0.95 to get the 95th percentile value.
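To make the fractional argument concrete, here is a minimal sketch using the close prices from the question. The variable names median and p95 are only illustrative, and the results rely on pandas' default linear interpolation.

```python
import pandas as pd

# Close prices from the question
close = pd.Series([38.23, 34.03, 31.00, 32.00], name="close")

# quantile() takes a fraction in [0, 1], not a percentage
median = close.quantile(0.5)   # value at the 50th percentile
p95 = close.quantile(0.95)     # value at the 95th percentile

print(median, p95)
```

Keep in mind that quantile() goes from a percentile to a value, which is the reverse of what the question asks (from each value to its percentile).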
df.close.apply(lambda x: stats.percentileofscore(df.close.sort_values(),x))
or
df.close.rank(pct=True)
Output of the second version (the first returns the same values on a 0-100 scale):
0 1.00
1 0.75
2 0.25
3 0.50
Name: close, dtype: float64
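Putting it together, here is a sketch of how the result might be written into the new column, assuming the DataFrame from the question. No loop over the DataFrame is needed. The column name percentile_scipy is made up here just to show both versions side by side.

```python
import pandas as pd
from scipy import stats

data = {
    "symbol": "FB",
    "date": ["2012-05-18", "2012-05-21", "2012-05-22", "2012-05-23"],
    "close": [38.23, 34.03, 31.00, 32.00],
}
df = pd.DataFrame(data)

# Vectorized: rank of each close as a fraction of the column (0-1 scale)
df["percentile"] = df["close"].rank(pct=True)

# scipy version: score one scalar at a time (0-100 scale).
# "percentile_scipy" is a hypothetical column name used for comparison only.
df["percentile_scipy"] = df["close"].apply(
    lambda x: stats.percentileofscore(df["close"], x)
)

print(df)
```

The rank(pct=True) version is usually the simpler choice, since it stays within pandas and avoids calling a Python function once per row.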