
Efficient Cointegration Test in Python

I am wondering if there is a better way to test if two variables are cointegrated than the following method:

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.tsa.stattools as ts

    y = np.random.normal(0, 1, 250)
    x = np.random.normal(0, 1, 250)

    def cointegration_test(y, x):
        # Step 1: regress one variable on the other
        ols_result = sm.OLS(y, x).fit()
        # Step 2: obtain the residuals (ols_result.resid)
        # Step 3: apply the Augmented Dickey-Fuller test to see whether
        #         the residuals have a unit root
        return ts.adfuller(ols_result.resid)

The above method works; however, it is not very efficient. When I run sm.OLS, a lot of things are calculated besides the residuals, which of course increases the run time. I could write my own code that calculates just the residuals, but I don't think that would be very efficient either.
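For reference, a residuals-only version of step 1 might look like the sketch below. It assumes a single regressor and no intercept, which is what sm.OLS(y, x) fits in the code above; the function name ols_residuals is hypothetical. The ADF step also fits regressions internally, so the overall saving may be modest.

```python
import numpy as np

def ols_residuals(y, x):
    """Residuals from regressing y on x with no intercept (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Closed-form least-squares slope for one regressor and no constant
    beta = np.dot(x, y) / np.dot(x, x)
    return y - beta * x

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 250)
y = rng.normal(0, 1, 250)

resid = ols_residuals(y, x)
```

The resulting array can be passed to ts.adfuller(resid) in place of ols_result.resid.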

I am looking for a built-in test that tests for cointegration directly. I was thinking of pandas, but I can't seem to find anything. Or maybe there is a clever way to test for cointegration without running a regression, or some other efficient method.

I have to run a lot of cointegration tests, and it would be nice to improve on my current method.

Akavall asked Jul 06 '12 13:07


1 Answer

You could try the following:

    import statsmodels.tsa.stattools as ts

    result = ts.coint(x, y)

Edit:

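    import numpy as np
    import pandas as pd
    import pandas_datareader.data as web  # pandas.io.data moved to the pandas-datareader package
    import statsmodels.tsa.stattools as ts

    # Daily prices for Facebook and Apple over one year
    data1 = web.DataReader('FB', data_source='yahoo', start='4/4/2015', end='4/4/2016')
    data2 = web.DataReader('AAPL', data_source='yahoo', start='4/4/2015', end='4/4/2016')

    # Align the two frames on their date index
    data1['key'] = data1.index
    data2['key'] = data2.index
    result = pd.merge(data1, data2, on='key')

    # Closing prices: the _x suffix is data1 (FB), _y is data2 (AAPL)
    x1 = result['Close_x']
    y1 = result['Close_y']

    coin_result = ts.coint(x1, y1)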

The code is self-explanatory:

1) Import the necessary packages.
2) Fetch Facebook and Apple stock data for a one-year period.
3) Merge the data on the date column.
4) Select the closing prices.
5) Run the cointegration test.
6) The variable coin_result holds the results of the cointegration test: the test statistic, the p-value, and the critical values at the 1%, 5%, and 10% levels.

Abhishek Kulkarni answered Sep 20 '22 18:09
