
Why does OLS raise LinAlgError: SVD did not converge?

Tags:

python

I have a DataFrame, df:

Num Col2 Col3  Col4  
1   6     1     1   
2   60    0     2   
3   60    0     1   
4   6     0     1   
5   60    1     1   

And the code:

import statsmodels.api as sm

y = df.loc[:, 'Col3']  # response
X = df.loc[:, ['Col2', 'Col4']]  # predictors
X = sm.add_constant(X)  # add an intercept column
est = sm.OLS(y, X)  # build the regression model
est = est.fit()  # fit the full model

When execution reaches .fit(), it raises this error:

Traceback (most recent call last):
File "D:\Users\Anna\workspace\mob1\mobols.py", line 36, in <module>
est = est.fit() #full model
File "C:\Python27\lib\site-packages\statsmodels\regression\linear_model.py", line 174, in fit
self.pinv_wexog, singular_values = pinv_extended(self.wexog)
File "C:\Python27\lib\site-packages\statsmodels\tools\tools.py", line 392, in pinv_extended
u, s, vt = np.linalg.svd(X, 0)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 1327, in svd
u, s, vt = gufunc(a, signature=signature, extobj=extobj)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 99, in _raise_linalgerror_svd_nonconvergence
raise LinAlgError("SVD did not converge")
numpy.linalg.linalg.LinAlgError: SVD did not converge

What's the problem? And how can I solve it?

Thank you

Anya asked Oct 31 '22 07:10



1 Answer

It looks like you're using pandas and statsmodels. I ran your snippet and could not reproduce the LinAlgError("SVD did not converge") exception. Here's what I ran:

import numpy as np
import pandas
import statsmodels.api as sm
d = {'col2': [6, 60, 60, 6, 60], 'col3': [1, 0, 0, 0, 1], 'col4': [1, 2, 1, 1, 1]}
df = pandas.DataFrame(data=d, index=np.arange(1, 6))
print(df)

Prints:

   col2  col3  col4
1     6     1     1
2    60     0     2
3    60     0     1
4     6     0     1
5    60     1     1

y = df.loc[:, 'col3']
X = df.loc[:, ['col2', 'col4']]
X = sm.add_constant(X)
est = sm.OLS(y, X)
est = est.fit()
print(est.summary())

This prints:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                   col3   R-squared:                       0.167
Model:                            OLS   Adj. R-squared:                 -0.667
Method:                 Least Squares   F-statistic:                    0.2000
Date:                Sat, 28 Mar 2015   Prob (F-statistic):              0.833
Time:                        16:43:02   Log-Likelihood:                -3.0711
No. Observations:                   5   AIC:                             12.14
Df Residuals:                       2   BIC:                             10.97
Df Model:                           2                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          1.0000      1.003      0.997      0.424        -3.316     5.316
col2       -8.674e-18      0.013  -6.62e-16      1.000        -0.056     0.056
col4          -0.5000      0.866     -0.577      0.622        -4.226     3.226
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.500
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.638
Skew:                          -0.000   Prob(JB):                        0.727
Kurtosis:                       1.250   Cond. No.                         187.
==============================================================================

So the code itself works on the data you posted, and the problem is probably elsewhere. Could it be that the df you pass in is not the DataFrame you think it is?
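One way to check whether df holds what you expect is to scan it for values the SVD cannot handle. NaN or infinite entries in the design matrix are a common trigger for this error, though whether that is what happened in your case is only a guess. The sketch below uses a hypothetical helper, find_bad_values, and two made-up frames (clean and dirty) that are not from your code:

```python
import numpy as np
import pandas as pd


def find_bad_values(frame):
    """Return {column name: count} for columns holding NaN, inf, or non-numeric entries.

    Hypothetical helper for illustration; not part of pandas or statsmodels.
    """
    report = {}
    for name in frame.columns:
        # Coerce to float; anything that cannot be parsed as a number becomes NaN.
        values = pd.to_numeric(frame[name], errors="coerce").astype(float)
        bad = int((~np.isfinite(values)).sum())
        if bad:
            report[name] = bad
    return report


# The data from the question: every entry is finite.
clean = pd.DataFrame({
    "Col2": [6, 60, 60, 6, 60],
    "Col3": [1, 0, 0, 0, 1],
    "Col4": [1, 2, 1, 1, 1],
})

# A made-up variant with one NaN and one inf, to show what the check reports.
dirty = clean.astype(float)
dirty.loc[2, "Col2"] = np.nan
dirty.loc[4, "Col4"] = np.inf

print(find_bad_values(clean))
print(find_bad_values(dirty))
```

If the report for your real df is not empty, drop or fill those rows before calling sm.OLS, then try the fit again.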

Scott answered Nov 11 '22 09:11
