Pandas df.describe() - how do I extract values into Dataframe?

I am trying to build a naive Bayes classifier. After loading some data into a pandas DataFrame, I found that the describe function reports the statistics I want. I'd like to capture the mean and std of each column of the table but am unsure how to do that. I've tried things like:

df.describe([mean])
df.describe(['mean'])
df.describe().mean

None of these work. I was able to do something similar in R with summary but don't know how to do it in Python. Can someone offer some advice?

Vaslo asked Jan 27 '19 22:01


2 Answers

Please try something like this:

df.describe(include='all').loc['mean']
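
A minimal sketch of this row-label approach, assuming a small made-up DataFrame (the column names height and weight are hypothetical, not from the question). It calls describe() without include='all', which should only matter when non-numeric columns need to appear in the summary, and passes a list of labels to .loc to pull both rows at once:

```python
import pandas as pd

# Hypothetical data standing in for the asker's table.
df = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "weight": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# describe() returns a DataFrame whose row index holds the statistic names
# (count, mean, std, ...), so .loc selects statistics by row label.
stats = df.describe().loc[["mean", "std"]]
print(stats)

# A single value: the mean of one column.
print(stats.loc["mean", "height"])
```

The result is itself a DataFrame with one row per statistic and one column per original column.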
milos.ai answered Sep 30 '22 18:09


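You were close, and you don't need the include argument. On a DataFrame, describe() puts the statistic names in the row index, so select them by row label: df.describe().loc['mean']. The plain bracket form from your second approach, describe()['mean'], works when the object is a Series, because there the statistic names are the index of the result.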

For example:

import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])
s.describe()['mean']
# 3.0

If you want both mean and std, pass a list of labels: s.describe()[['mean', 'std']] for a Series, or df.describe().loc[['mean', 'std']] for a DataFrame. For example,

s.describe()[['mean', 'std']]
# mean    3.000000
# std     1.581139
# dtype: float64
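
Since the question asks for the values in a DataFrame, here is a hedged sketch of one alternative that skips describe() entirely: DataFrame.agg with a list of statistic names, transposed so that each original column becomes a row. The data and the column names height and weight are made up for illustration.

```python
import pandas as pd

# Hypothetical numeric data; not taken from the question.
df = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "weight": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# agg with a list of names returns one row per statistic;
# .T flips it so each original column is a row with 'mean' and 'std' columns.
summary = df.agg(["mean", "std"]).T
print(summary)

# Look up the parameters for one feature.
mu = summary.loc["weight", "mean"]
sigma = summary.loc["weight", "std"]
```

This layout may be convenient for something like Gaussian naive Bayes, where each feature needs its own mean and standard deviation. Note that std here is the sample standard deviation (ddof=1), the same as in describe().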
Sheldore answered Sep 30 '22 19:09