I am trying to display the output of percentile distribution for each column as a dataframe as I want to export it to csv later.
I have simply looped all the columns like this :
for column in data:
print(data[column].describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]))
However, I couldn't figure out how to do rest. Any help is hugely appreciated!
+Editing the main question with an additional query :
I also want to group my output by a column such as data.groupby(data['MARKET']).describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99])
However, I receive the error like "describe() takes 1 positional argument but 2 were given". How can I deal with this problem?
Sample data set:
d = {'col1': [1, 2, 3, 2, 1],
'col2': [3, 4, 5, 6, 7],
'country': ['TR', 'UK', 'UK' , 'TR', 'TR']};
df = pd.DataFrame(data=d)
Is that what you want?
In [19]: df = pd.DataFrame(np.arange(15).reshape(5,3)).add_prefix('col')
In [20]: df
Out[20]:
col0 col1 col2
0 0 1 2
1 3 4 5
2 6 7 8
3 9 10 11
4 12 13 14
In [21]: df.describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99])
Out[21]:
col0 col1 col2
count 5.000000 5.000000 5.000000
mean 6.000000 7.000000 8.000000
std 4.743416 4.743416 4.743416
min 0.000000 1.000000 2.000000
1% 0.120000 1.120000 2.120000
10% 1.200000 2.200000 3.200000
20% 2.400000 3.400000 4.400000
30% 3.600000 4.600000 5.600000
40% 4.800000 5.800000 6.800000
50% 6.000000 7.000000 8.000000
60% 7.200000 8.200000 9.200000
70% 8.400000 9.400000 10.400000
80% 9.600000 10.600000 11.600000
90% 10.800000 11.800000 12.800000
99% 11.880000 12.880000 13.880000
max 12.000000 13.000000 14.000000
UPDATE:
d = {'col1': [1, 2, 3, 2, 1], 'col2': [3, 4, 5, 6, 7], 'country': ['TR', 'UK', 'UK' , 'TR', 'TR']};
df = pd.DataFrame(data=d)
In [29]: df.groupby('country').apply(lambda x: x.describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]))
Out[29]:
col1 col2
country
TR count 3.000000 3.000000
mean 1.333333 5.333333
std 0.577350 2.081666
min 1.000000 3.000000
1% 1.000000 3.060000
10% 1.000000 3.600000
20% 1.000000 4.200000
30% 1.000000 4.800000
40% 1.000000 5.400000
50% 1.000000 6.000000
60% 1.200000 6.200000
70% 1.400000 6.400000
80% 1.600000 6.600000
90% 1.800000 6.800000
99% 1.980000 6.980000
max 2.000000 7.000000
UK count 2.000000 2.000000
mean 2.500000 4.500000
std 0.707107 0.707107
min 2.000000 4.000000
1% 2.010000 4.010000
10% 2.100000 4.100000
20% 2.200000 4.200000
30% 2.300000 4.300000
40% 2.400000 4.400000
50% 2.500000 4.500000
60% 2.600000 4.600000
70% 2.700000 4.700000
80% 2.800000 4.800000
90% 2.900000 4.900000
99% 2.990000 4.990000
max 3.000000 5.000000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With