I have an array:
Num Col2 Col3 Col4
1 6 1 1
2 60 0 2
3 60 0 1
4 6 0 1
5 60 1 1
And the code:
y = df.loc[:,'Col3'] # response
X = df.loc[:,['Col2','Col4']] # predictor
X = sm.add_constant(X) #add constant
est = sm.OLS(y, X) #build regression
est = est.fit() #full model
When execution reaches .fit(), it raises the following error:
Traceback (most recent call last):
File "D:\Users\Anna\workspace\mob1\mobols.py", line 36, in <module>
est = est.fit() #full model
File "C:\Python27\lib\site-packages\statsmodels\regression\linear_model.py", line 174, in fit
self.pinv_wexog, singular_values = pinv_extended(self.wexog)
File "C:\Python27\lib\site-packages\statsmodels\tools\tools.py", line 392, in pinv_extended
u, s, vt = np.linalg.svd(X, 0)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 1327, in svd
u, s, vt = gufunc(a, signature=signature, extobj=extobj)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 99, in _raise_linalgerror_svd_nonconvergence
raise LinAlgError("SVD did not converge")
numpy.linalg.linalg.LinAlgError: SVD did not converge
What's the problem? And how can I solve it?
Thank you
It looks like you're using pandas and statsmodels. I ran your snippet on the data you posted and did not get the LinAlgError("SVD did not converge") exception. Here's what I ran:
import numpy as np
import pandas
import statsmodels.api as sm
d = {'col2': [6, 60, 60, 6, 60], 'col3': [1, 0, 0, 0, 1], 'col4': [1, 2, 1, 1, 1]}
df = pandas.DataFrame(data=d, index=np.arange(1, 6))
print(df)
Prints:
col2 col3 col4
1 6 1 1
2 60 0 2
3 60 0 1
4 6 0 1
5 60 1 1
y = df.loc[:, 'col3']
X = df.loc[:, ['col2', 'col4']]
X = sm.add_constant(X)
est = sm.OLS(y, X)
est = est.fit()
print(est.summary())
This prints:
OLS Regression Results
==============================================================================
Dep. Variable: col3 R-squared: 0.167
Model: OLS Adj. R-squared: -0.667
Method: Least Squares F-statistic: 0.2000
Date: Sat, 28 Mar 2015 Prob (F-statistic): 0.833
Time: 16:43:02 Log-Likelihood: -3.0711
No. Observations: 5 AIC: 12.14
Df Residuals: 2 BIC: 10.97
Df Model: 2
==============================================================================
coef std err t P>|t| [95.0% Conf. Int.]
------------------------------------------------------------------------------
const 1.0000 1.003 0.997 0.424 -3.316 5.316
col2 -8.674e-18 0.013 -6.62e-16 1.000 -0.056 0.056
col4 -0.5000 0.866 -0.577 0.622 -4.226 3.226
==============================================================================
Omnibus: nan Durbin-Watson: 1.500
Prob(Omnibus): nan Jarque-Bera (JB): 0.638
Skew: -0.000 Prob(JB): 0.727
Kurtosis: 1.250 Cond. No. 187.
==============================================================================
So the code itself works on the data you posted, which suggests the problem lies in what is actually being passed in. Could it be that df refers to a different DataFrame than the one you showed?
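If df turns out not to be what you expect, one quick check is whether the columns going into the model hold only finite numbers. A missing or infinite value is a common reason for NumPy's SVD to fail to converge, but that is a guess about your real data, not something visible in the question. A minimal sketch, using a made-up frame with one NaN in place of your real df:

```python
import numpy as np
import pandas

# Hypothetical stand-in for the real df: same columns as the question,
# with one value deliberately set to NaN for illustration.
df = pandas.DataFrame(
    {'Col2': [6, 60, np.nan, 6, 60],
     'Col3': [1, 0, 0, 0, 1],
     'Col4': [1, 2, 1, 1, 1]},
    index=np.arange(1, 6),
)

cols = ['Col2', 'Col3', 'Col4']

# Non-numeric (object) dtypes here would also point to a loading problem.
print(df[cols].dtypes)

# Mark the rows in which every model column is a finite number.
values = df[cols].to_numpy(dtype=float)
finite_rows = np.isfinite(values).all(axis=1)

# Show the offending rows, then keep only the clean ones.
print(df.loc[~finite_rows])
clean = df.loc[finite_rows, cols]
```

If the second print shows any rows, building y and X from clean instead of df is one way to proceed; whether dropping those rows is appropriate depends on your data.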