I am trying to use groupby and np.std to calculate a standard deviation, but it seems to be calculating the sample standard deviation (with delta degrees of freedom, ddof, equal to 1).
Here is a sample.
#create dataframe
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A':[1,1,2,2],'B':[1,2,1,2],'values':np.arange(10,30,5)})
>>> df
   A  B  values
0  1  1      10
1  1  2      15
2  2  1      20
3  2  2      25
#calculate standard deviation using groupby
>>> df.groupby('A').agg(np.std)
          B    values
A
1  0.707107  3.535534
2  0.707107  3.535534
#Calculate using numpy (np.std)
>>> np.std([10,15],ddof=0)
2.5
>>> np.std([10,15],ddof=1)
3.5355339059327378
Is there a way to use the population std calculation (ddof=0) with the groupby statement? The records I am using (not the example table above) are not samples, so I am only interested in population standard deviations.
Pandas Groupby Standard Deviation: the following is a step-by-step guide to what you need to do. Group the dataframe on the column(s) you want. Select the field(s) for which you want to estimate the standard deviation. Apply the pandas std() function directly or pass 'std' to the agg() function.
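The three steps above can be sketched as follows, using the example dataframe from the question. This is a minimal illustration rather than the only way to do it; the variable names are made up for the example, and the last line assumes you want the population figure, which the built-in groupby std() supports through its ddof parameter.

```python
import numpy as np
import pandas as pd

# Example dataframe from the question
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 1, 2],
                   'values': np.arange(10, 30, 5)})

grouped = df.groupby('A')            # 1. group on the column you want
values = grouped['values']           # 2. select the field of interest

sample_std = values.std()            # 3a. apply std() directly (default ddof=1)
sample_std_agg = values.agg('std')   # 3b. or pass 'std' to agg()

# Population standard deviation: pass ddof=0 to the groupby std()
population_std = values.std(ddof=0)
```

For this data, sample_std is about 3.535534 for both groups and population_std is 2.5, matching the np.std results shown in the question.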
Coding a stdev() Function in Python: compute the variance, then use sqrt() to take the square root of the variance. With a ddof parameter in the implementation, we can use ddof=0 to calculate the standard deviation of a population, or we can use ddof=1 to estimate the standard deviation of a population using a sample of data.
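A stdev() function along those lines might look like the sketch below. The snippet above does not show its implementation, so the function name, signature, and error handling here are assumptions made for illustration; only the standard library is used.

```python
import math

def stdev(data, ddof=0):
    """Standard deviation with a configurable ddof (hypothetical helper).

    ddof=0 gives the population standard deviation,
    ddof=1 gives the sample estimate.
    """
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more data points than ddof")
    mean = sum(data) / n
    # Sum of squared deviations from the mean
    squared_deviations = sum((x - mean) ** 2 for x in data)
    variance = squared_deviations / (n - ddof)
    return math.sqrt(variance)

population = stdev([10, 15], ddof=0)
sample = stdev([10, 15], ddof=1)
```

With the values [10, 15] from the question, population is 2.5 and sample is about 3.5355, the same figures np.std returns for ddof=0 and ddof=1.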
The groupby() function is used to split the data into groups based on some criteria. pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names. The sort parameter controls whether the group keys are sorted.
You can pass additional arguments to np.std in the agg function:
In [202]: df.groupby('A').agg(np.std, ddof=0)
Out[202]:
     B  values
A
1  0.5     2.5
2  0.5     2.5

In [203]: df.groupby('A').agg(np.std, ddof=1)
Out[203]:
          B    values
A
1  0.707107  3.535534
2  0.707107  3.535534
For degrees of freedom = 0 (this means that bins with one number will end up with std=0 instead of NaN):
import numpy as np

def std(x):
    # np.std defaults to ddof=0, i.e. the population standard deviation
    return np.std(x)

df.groupby('A').agg(['mean', 'max', std])
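The point about single-value bins can be checked with a small example. The dataframe below is made up for illustration (group 'b' has only one row), and it uses the built-in groupby std() with an explicit ddof rather than the custom function above.

```python
import math
import pandas as pd

# Hypothetical data: group 'a' has two rows, group 'b' has one
df_single = pd.DataFrame({'A': ['a', 'a', 'b'],
                          'values': [10, 15, 20]})

grouped_values = df_single.groupby('A')['values']

sample_std = grouped_values.std(ddof=1)      # single-row group gives NaN
population_std = grouped_values.std(ddof=0)  # single-row group gives 0.0

single_sample = sample_std.loc['b']
single_population = population_std.loc['b']
```

With ddof=1 the divisor n - ddof is zero for a one-row group, so the result is NaN; with ddof=0 the divisor is 1 and the deviation from the mean is zero, so the result is 0.0.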