How can I calculate the standard deviation for rows of a dataframe?

df:  

name   group   S1   S2  S3        
A      mn      1    2   8         
B      mn      4    3   5        
C      kl      5    8   2        
D      kl      6    5   5         
E      fh      7    1   3         

output: 

std (S1,S2,S3)
3.78
1
3
0.57
3.05

This works for getting the std of a single column:

numpy.std(df['A'])

I want to do the same for each row.

asked Jul 13 '16 20:07

NamAshena

1 Answer

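You can use DataFrame.std. Older pandas versions silently omit non-numeric columns; since pandas 2.0 you have to pass numeric_only=True to get that behaviour:

print(df.std(numeric_only=True))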
S1    2.302173
S2    2.774887
S3    2.302173
dtype: float64

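If you need the std of each row (computed across the columns), pass axis=1. On pandas 2.0 or later, also pass numeric_only=True when the frame contains non-numeric columns:

print(df.std(axis=1, numeric_only=True))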
0    3.785939
1    1.000000
2    3.000000
3    0.577350
4    3.055050
dtype: float64
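
As a minimal sketch, assuming the frame from the question is rebuilt by hand: the row-wise result can be stored as a new column by selecting the score columns explicitly. The names score_cols and std are my own choices for illustration, not part of the question.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "group": ["mn", "mn", "kl", "kl", "fh"],
    "S1": [1, 4, 5, 6, 7],
    "S2": [2, 3, 8, 5, 1],
    "S3": [8, 5, 2, 5, 3],
})

# Hypothetical helper list: the columns that should enter the calculation.
score_cols = ["S1", "S2", "S3"]

# axis=1 reduces across the columns, giving one value per row (ddof=1 by default).
df["std"] = df[score_cols].std(axis=1)
print(df)
```

Selecting the columns by name avoids relying on how a given pandas version treats the string columns name and group.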

If you need only some of the numeric columns, select a subset first:

print(df[['S1','S2']].std())
S1    2.302173
S2    2.774887
dtype: float64

The result differs from numpy.std because the default value of the ddof (Delta Degrees of Freedom) parameter is different:

pandas uses ddof=1 by default
numpy uses ddof=0 by default

So the outputs differ (the examples below assume df contains only the numeric columns S1, S2 and S3):

#ddof=1
print (df.std(axis=1))
0    3.785939
1    1.000000
2    3.000000
3    0.577350
4    3.055050
dtype: float64

#ddof=0
print (np.std(df, axis=1))
0    3.091206
1    0.816497
2    2.449490
3    0.471405
4    2.494438
dtype: float64

But you can change it easily:

#same output as pandas function
print (np.std(df, ddof=1, axis=1))
0    3.785939
1    1.000000
2    3.000000
3    0.577350
4    3.055050
dtype: float64

#same output as numpy function
print (df.std(ddof=0, axis=1))
0    3.091206
1    0.816497
2    2.449490
3    0.471405
4    2.494438
dtype: float64
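
To see where the two numbers come from, here is a small sketch that recomputes both variants by hand for the first row (1, 2, 8). The variable names are mine; the only difference between the two results is the divisor, n - ddof.

```python
import numpy as np

# First row of the question's data: S1, S2, S3 for name A.
row = np.array([1, 2, 8], dtype=float)
n = row.size

# Sum of squared deviations from the row mean.
sq_dev_sum = ((row - row.mean()) ** 2).sum()

# ddof=1 divides by n - 1 (pandas default), ddof=0 divides by n (numpy default).
sample_std = np.sqrt(sq_dev_sum / (n - 1))
population_std = np.sqrt(sq_dev_sum / n)

# Both hand-computed values agree with numpy's own implementation.
assert np.isclose(sample_std, np.std(row, ddof=1))
assert np.isclose(population_std, np.std(row, ddof=0))

print(sample_std, population_std)
```

The two values correspond to the 3.785939 and 3.091206 shown for row 0 above.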

answered Nov 07 '22 17:11

jezrael
