Python Scipy: scipy.stats.spearmanr returning nans

Tags: python, scipy, correlation

Edit: Basically solved I think.

I am using spearmanr from scipy.stats to find the correlations between variables across a number of different samples. I have around 2500 variables and 36 samples (or 'observations').

If I calculate the correlations using all 36 samples, spearmanr works fine. If I use only the first 18 samples, it also works fine. However, if I use the last 18 samples, I get the warnings below and NaNs are returned.

This is the error:

/Home/s1215235/.local/lib/python2.7/site-packages/numpy/lib/function_base.py:1945: RuntimeWarning: invalid value encountered in true_divide
return c / sqrt(multiply.outer(d, d))
/Home/s1215235/.local/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py:1718: RuntimeWarning: invalid value encountered in greater
cond1 = (scale > 0) & (x > self.a) & (x < self.b)
/Home/s1215235/.local/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py:1718: RuntimeWarning: invalid value encountered in less
cond1 = (scale > 0) & (x > self.a) & (x < self.b)
/Home/s1215235/.local/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py:1719: RuntimeWarning: invalid value encountered in less_equal
cond2 = cond0 & (x <= self.a)

This is the code:

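import numpy as np
from scipy import stats
populationdata = np.vstack(thing).astype(float)  # builtin float; the np.float alias is gone from recent NumPy
rho, pval = stats.spearmanr(populationdata[:, sampleindexes], axis=1)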

(populationdata is a NumPy array of floats; indexing with [:, sampleindexes] selects only some of the columns.)

And this is what rho is returned as:

[[ 1.                 nan         nan ...,  1.         -0.05882353
  -0.08574929]
 [        nan         nan         nan ...,         nan         nan
          nan]
 [        nan         nan         nan ...,         nan         nan
          nan]
 ..., 
 [ 1.                 nan         nan ...,  1.         -0.05882353
  -0.08574929]
 [-0.05882353         nan         nan ..., -0.05882353  1.          0.68599434]
 [-0.08574929         nan         nan ..., -0.08574929  0.68599434  1.        ]]


asked Aug 20 '15 10:08

Catherine Georgia

1 Answer

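In a comment it was noted that "There are a lot of 0s though." So populationdata[:,sampleindexes] probably has rows that are all 0. A constant row has no variation in its ranks, so the correlation works out to 0/0 and spearmanr returns nan for every pair that involves that row. For example,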

In [3]: spearmanr([[0, 0, 0], [1, 2, 3]], axis=1)
/Users/warren/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.py:1957: RuntimeWarning: invalid value encountered in true_divide
  return c / sqrt(multiply.outer(d, d))
/Users/warren/anaconda/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py:1728: RuntimeWarning: invalid value encountered in greater
  cond1 = (scale > 0) & (x > self.a) & (x < self.b)
/Users/warren/anaconda/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py:1728: RuntimeWarning: invalid value encountered in less
  cond1 = (scale > 0) & (x > self.a) & (x < self.b)
/Users/warren/anaconda/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py:1729: RuntimeWarning: invalid value encountered in less_equal
  cond2 = cond0 & (x <= self.a)
Out[3]: (nan, nan)
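
If the constant rows are indeed the cause, one possible workaround is to find them and leave them out before calling spearmanr. The following is only a minimal sketch: the helper name spearman_without_constant_rows and the random test data are made up for illustration and are not part of the original question or of SciPy.

```python
import numpy as np
from scipy import stats

def spearman_without_constant_rows(data):
    """Hypothetical helper: drop rows with no variation, then correlate the rest."""
    data = np.asarray(data, dtype=float)
    # A row is constant when its largest and smallest values are equal.
    keep = data.max(axis=1) > data.min(axis=1)
    # axis=1 treats each row as a variable, as in the question.
    rho, pval = stats.spearmanr(data[keep], axis=1)
    return keep, rho, pval

# Made-up data: 4 variables, 18 samples, with one all-zero variable.
rng = np.random.default_rng(0)
sample = rng.normal(size=(4, 18))
sample[1] = 0.0

keep, rho, pval = spearman_without_constant_rows(sample)
print(keep)                  # boolean mask of the rows that were kept
print(np.isnan(rho).any())   # whether any nan remains in the correlations
```

The mask keep maps the rows of rho back to the original variables. Note that when exactly two rows remain, spearmanr returns a single correlation value rather than a matrix.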


answered Oct 13 '22 08:10

Warren Weckesser
