Pandas: why does pandas.Series.std() differ so much from numpy.std()?

Tags:

python

pandas

numpy

I have the following two code snippets.

import numpy
numpy.std([766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346])
0

and

import pandas as pd
pd.Series([766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346]).std(ddof=0)
10.119288512538814

That's a huge difference.

May I ask why?

216

asked Jul 02 '15 06:07

Tony

1 Answer

This issue is already under discussion (link). The problem seems to be that the algorithm pandas uses to calculate the standard deviation is not as numerically stable as the one numpy uses.

An easy workaround is to call .values on the series first and then call std on the result; in this case numpy's std is used (its default of ddof=0 matches the ddof=0 passed in the question):

pd.Series([766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346]).values.std()

which gives you the expected value 0.
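To illustrate what "numerically stable" means here, below is a minimal pure-Python sketch. It is not the code of either library: the function names naive_variance and two_pass_std are made up for this example, and treating the one-pass formula E[x^2] - (E[x])^2 as the kind of algorithm at fault is an assumption. The squares of the values in the question are around 5.9e17, which is beyond 2**53, so a 64-bit float cannot store them exactly, and the one-pass formula then subtracts two huge, nearly equal numbers. The two-pass version subtracts the mean first and only squares small deviations.

```python
import math


def naive_variance(values):
    """One-pass population variance, E[x^2] - (E[x])^2, in floating point.

    Hypothetical illustration of an unstable formula, not library code.
    """
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for v in values:
        x = float(v)
        total += x
        total_sq += x * x  # x * x is about 5.9e17 here, already rounded
    mean = total / n
    # Difference of two huge, nearly equal numbers: rounding error dominates.
    return total_sq / n - mean * mean


def two_pass_std(values):
    """Two-pass population standard deviation (ddof=0)."""
    n = len(values)
    mean = math.fsum(float(v) for v in values) / n
    # Deviations from the mean are small, so squaring them loses nothing.
    return math.sqrt(math.fsum((float(v) - mean) ** 2 for v in values) / n)


data = [766897346] * 10
print(naive_variance(data))  # may be nonzero (or even negative) due to rounding
print(two_pass_std(data))    # 0.0
```

Because every element equals the mean exactly, the two-pass version returns exactly 0.0 for this input, while the result of the one-pass version depends on how the rounding errors in the sum of squares happen to accumulate.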

179

answered Oct 14 '22 08:10

Cleb
