I am trying to implement naive Bayes. After loading some data into a pandas DataFrame, I found that the describe function reports the statistics I want. I'd like to capture the mean and std of each column of the table, but I'm unsure how to do that. I've tried things like:
df.describe([mean])
df.describe(['mean'])
df.describe().mean
None of these work. I was able to do something similar in R with summary but don't know how to do it in Python. Can someone offer some advice?
Please try something like this:
df.describe(include='all').loc['mean']
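To see why .loc is the right accessor here, a minimal sketch follows. The DataFrame and its column names (height, weight) are made up for illustration and are not from the question. describe() returns a table whose rows are the statistics and whose columns are the original columns, so a statistic is selected by row label.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({"height": [1.0, 2.0, 3.0], "weight": [10.0, 20.0, 60.0]})

summary = df.describe()      # rows: count, mean, std, min, ...; columns: height, weight
means = summary.loc["mean"]  # a Series indexed by the original column names

print(means["height"])  # 2.0
print(means["weight"])  # 30.0
```

Because the statistics are row labels, plain df.describe()['mean'] would look for a column called mean and raise a KeyError on this frame.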
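You were close. You don't need the include argument. describe() puts the statistics in the index, so on a Series you can select one by label directly with s.describe()['mean'], while on a DataFrame you need .loc to select the row: df.describe().loc['mean']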
For example:
import pandas as pd
s = pd.Series([1, 2, 3, 4, 5])
s.describe()['mean']
# 3.0
If you want both mean and std, pass a list of labels: s.describe()[['mean', 'std']] for a Series, or df.describe().loc[['mean', 'std']] for a DataFrame. For example,
s.describe()[['mean', 'std']]
# mean 3.000000
# std 1.581139
# dtype: float64
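For a DataFrame, which is what the question asks about, the same idea might look like the sketch below. The frame and its column names (a, b) are hypothetical. It also shows df.agg as a possible alternative that computes only the two statistics instead of the full describe() table.

```python
import pandas as pd

# Hypothetical example data; column names are assumptions for illustration.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 10]})

# Select the mean and std rows from the describe() table.
stats = df.describe().loc[["mean", "std"]]

# Alternative: compute only these two statistics per column.
direct = df.agg(["mean", "std"])

print(stats.loc["mean", "a"])  # 3.0
print(stats.loc["mean", "b"])  # 6.0
```

Both results have mean and std as row labels and one column per original column, so stats.loc['std', 'a'] gives the standard deviation of column a. Note that pandas uses the sample standard deviation (ddof=1) by default.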