from datetime import datetime
from pandas_datareader.data import DataReader  # pandas.io.data was removed; use pandas-datareader
from numpy import cumsum, log, polyfit, sqrt, std, subtract
from numpy.random import randn
def hurst(ts):
    """Returns the Hurst Exponent of the time series vector ts"""
    # Create the range of lag values
    lags = range(2, 100)
    # Calculate the array of the variances of the lagged differences
    # Here it calculates the variances, but why does it use the
    # standard deviation and then take a square root of it?
    tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags]
    # Use a linear fit to estimate the Hurst Exponent
    poly = polyfit(log(lags), log(tau), 1)
    # Return the Hurst exponent from the polyfit output
    return poly[0]*2.0
# Download the stock price series from Yahoo
aapl = DataReader("AAPL", "yahoo", datetime(2012,1,1), datetime(2015,9,18))
# Call the function
hurst(aapl['Adj Close'])
In this code for estimating the Hurst exponent, when we want to calculate the variance of the lagged differences, why do we still use the standard deviation and then take a square root of it? I have been confused about this for a long time, and I don't understand why others don't share the same confusion. Am I misunderstanding the math behind it? Thanks!
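For reference, here is a self-contained sanity check of the function on synthetic series with known Hurst behaviour, so no Yahoo download is needed. The series constructions and the offset of 2000 are illustrative assumptions, not part of the original code:

```python
import numpy as np

def hurst(ts):
    """Returns the Hurst Exponent of the time series vector ts (same as above)."""
    lags = range(2, 100)
    tau = [np.sqrt(np.std(np.subtract(ts[lag:], ts[:-lag]))) for lag in lags]
    poly = np.polyfit(np.log(lags), np.log(tau), 1)
    return poly[0] * 2.0

np.random.seed(0)
# Random-walk-like log prices: Hurst should be near 0.5
gbm = np.log(np.cumsum(np.random.randn(100000)) + 2000)
# Mean-reverting (white-noise) log prices: Hurst should be near 0.0
mr = np.log(np.random.randn(100000) + 2000)
# Trending log prices: Hurst should approach 1.0
tr = np.log(np.cumsum(np.random.randn(100000) + 1) + 2000)

print("GBM:", hurst(gbm))  # close to 0.5
print("MR: ", hurst(mr))   # close to 0.0
print("TR: ", hurst(tr))   # close to 1.0
```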
All that is going on here is a variation in math notation.
I'll define
d = subtract(ts[lag:], ts[:-lag])
Then it is clear that
np.log(np.std(d)**2) == np.log(np.var(d))
np.log(np.std(d)) == .5*np.log(np.var(d))
Then you have the equivalence
2*np.log(np.sqrt(np.std(d))) == .5*np.log(np.var(d))
The slope returned by polyfit scales proportionally with its y input, so the extra factor of 0.5 introduced by the square root is exactly cancelled by the final multiplication by 2.
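These identities can be verified numerically; here d is just a random stand-in for the lagged differences:

```python
import numpy as np

np.random.seed(1)
d = np.random.randn(10000)  # stand-in for subtract(ts[lag:], ts[:-lag])

# log of the squared std equals log of the variance
assert np.isclose(np.log(np.std(d)**2), np.log(np.var(d)))
# log of the std is half the log of the variance
assert np.isclose(np.log(np.std(d)), 0.5 * np.log(np.var(d)))
# hence twice the log of sqrt(std) is also half the log of the variance
assert np.isclose(2 * np.log(np.sqrt(np.std(d))), 0.5 * np.log(np.var(d)))
print("all identities hold")
```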
I'm just as confused. I don't understand where the sqrt of the std comes from either, and I spent 3 days trying to figure it out. In the end I noticed that QuantStart credits Dr. Tom Starke, who uses slightly different code. Dr. Tom Starke credits Dr. Ernie Chan, and going to his blog, I was able to find enough information to put together my own code from his principles. This doesn't use sqrt, uses variance instead of std, and uses a 2.0 divisor at the end instead of a 2.0 multiplier. In the end, it seems to give the same results as the QuantStart code you posted, but I am able to understand it from first principles, which I guess is important. I put together a Jupyter Notebook which makes it clearer, but I'm not sure if I can post that here, so I will try to explain as best I can. Code is pasted first, then an explanation.
from numpy import subtract, var, log10, polyfit

lags = range(2, 100)

def hurst_ernie_chan(p):
    variancetau = []
    tau = []
    for lag in lags:
        # Write the different lags into a vector to compute a set of tau or lags
        tau.append(lag)
        # Compute the price difference over the given lag;
        # call this pp, the price difference
        pp = subtract(p[lag:], p[:-lag])
        variancetau.append(var(pp))
    # We now have a set of tau (lags) and a corresponding set of variances.
    # print(tau)
    # print(variancetau)
    # Plot the log of those variances against the log of tau and get the slope
    m = polyfit(log10(tau), log10(variancetau), 1)
    hurst = m[0] / 2
    return hurst
Dr. Chan doesn't give any code on that page (I believe he works in MATLAB, not Python, anyway). Hence I needed to put together my own code from the notes he gives in his blog and the answers he gives to questions posed there.
Dr. Chan states that if z is the log price, then volatility, sampled at intervals of τ, is volatility(τ) = √(Var(z(t) - z(t-τ))). To me, another way of describing volatility is standard deviation, so std(τ) = √(Var(z(t) - z(t-τ))).
Since std is just the square root of the variance, squaring both sides gives var(τ) = Var(z(t) - z(t-τ)).
Dr Chan then states: In general, we can write Var(τ) ∝ τ^(2H) where H is the Hurst exponent
Hence (Var(z(t)-z(t-τ))) ∝ τ^(2H)
Taking the log of each side we get log (Var(z(t)-z(t-τ))) ∝ 2H log τ
[ log(Var(z(t) - z(t-τ))) / log τ ] / 2 ∝ H (gives the Hurst exponent), where we know that the term in square brackets on the far left is the slope of a log-log plot of tau against the corresponding set of variances.
If you run that function and compare the answers to the QuantStart function, they should be the same. Not sure if that helped.
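As a sketch of that comparison (both estimators are re-declared here in condensed form so the snippet is self-contained), the two give essentially identical values on a simulated random walk:

```python
import numpy as np

lags = range(2, 100)

def hurst_quantstart(ts):
    # sqrt of std, natural log, slope * 2
    tau = [np.sqrt(np.std(np.subtract(ts[lag:], ts[:-lag]))) for lag in lags]
    return np.polyfit(np.log(lags), np.log(tau), 1)[0] * 2.0

def hurst_ernie_chan(p):
    # plain variance, log10, slope / 2
    variancetau = [np.var(np.subtract(p[lag:], p[:-lag])) for lag in lags]
    return np.polyfit(np.log10(lags), np.log10(variancetau), 1)[0] / 2.0

np.random.seed(42)
walk = np.cumsum(np.random.randn(100000))  # random walk, true H = 0.5
print(hurst_quantstart(walk), hurst_ernie_chan(walk))  # agree, near 0.5
```

The agreement is not a coincidence: rescaling both fit inputs by 1/ln(10) (log10 vs. natural log) leaves the least-squares slope unchanged, and the extra sqrt scales the slope by exactly the factor that the final ×2 vs. ÷2 undoes.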
As per the intuitive definition taken from Ernest Chan's "Algorithmic Trading" (p. 44):
Intuitively speaking, a “stationary” price series means that the prices diffuse from its initial value more slowly than a geometric random walk would.
one would want to check the variance of the time series at increasing lags against the lag(s). This is because for a normal distribution -- and log prices are believed to be normal (to a certain extent) -- the variance of a sum of normal distributions is the sum of the constituents' variances.
As per Ernest Chan's citation, for mean reverting processes the realized variance will be less than theoretically projected.
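That variance-additivity point can be seen directly on a simulated random walk (a sketch of my own, not from Chan's book): the lag-k difference is a sum of k independent unit-variance increments, so its variance should be close to k.

```python
import numpy as np

np.random.seed(7)
steps = np.random.randn(200000)  # i.i.d. normal increments, Var = 1
walk = np.cumsum(steps)          # random walk built from those increments

# Variance of the lag-k difference ~ k * Var(step) = k
for k in (1, 10, 100):
    v = np.var(walk[k:] - walk[:-k])
    print(k, v)  # v close to k
```

For a mean-reverting series the measured variance grows more slowly than this, which is exactly the deficit the Hurst exponent quantifies.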
Putting this in code:
import numpy as np
from scipy.stats import linregress

def hurst(p, l):
    """
    Arguments:
        p: ndarray -- the price series to be tested
        l: list of integers or an integer -- lag(s) to test for mean reversion
    Returns:
        Hurst exponent
    """
    if isinstance(l, int):
        lags = [1, l]
    else:
        lags = l
    assert lags[-1] >= 2, "Lag in prices must be greater than or equal to 2"
    print(f"Price lags of {lags[1:]} are included")
    lp = np.log(p)
    var = [np.var(lp[lag:] - lp[:-lag]) for lag in lags]
    hr = linregress(np.log(lags), np.log(var))[0] / 2
    return hr
The code posted by OP is correct.
The reason for the confusion is that it takes a square root first, and then counters it by multiplying the slope (returned by polyfit) by 2.
For a more detailed explanation, continue reading.
tau is calculated with an "extra" square root, and then its log is taken. log(sqrt(x)) = log(x^0.5) = 0.5*log(x) (this is the key). polyfit therefore conducts the fitting with y multiplied by an "extra" 0.5, so the slope it returns is also multiplied by exactly 0.5. Returning twice that (return poly[0]*2.0) counters the (seemingly) extra 0.5.
Hope this makes it clearer.
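The cancellation is easy to demonstrate numerically: fit once with the "extra" square root and double the slope, and once on log(std) directly with no doubling, and the two results match.

```python
import numpy as np

lags = range(2, 100)
np.random.seed(3)
ts = np.cumsum(np.random.randn(100000))  # random walk, true H = 0.5

stds = [np.std(ts[lag:] - ts[:-lag]) for lag in lags]

# OP's version: log(sqrt(std)) halves every y value, halving the slope...
with_sqrt = np.polyfit(np.log(lags), np.log(np.sqrt(stds)), 1)[0] * 2.0
# ...so multiplying by 2 restores it, matching a fit of log(std) directly
without_sqrt = np.polyfit(np.log(lags), np.log(stds), 1)[0]

print(with_sqrt, without_sqrt)  # equal up to floating point
```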