
Python - generate array of specific autocorrelation

I want to generate an array (a NumPy array or pandas Series) of length N that exhibits a specific autocorrelation at lag 1. Ideally, I would also like to specify the mean and variance and have the data drawn from a (multivariate) normal distribution. Most importantly, though, I want to control the autocorrelation. How can I do this with NumPy or scikit-learn?

Just to be explicit and precise, this is the autocorrelation I want to control:

numpy.corrcoef(x[0:len(x) - 1], x[1:])[0][1]
Baron Yugovich asked Nov 24 '15 16:11


1 Answer

If you only care about the auto-correlation at lag one, you can generate an auto-regressive process of order one, AR(1), whose parameter equals the desired auto-correlation. For a stationary AR(1) process the lag-one auto-correlation is exactly that parameter; the Wikipedia page on autoregressive models mentions this property, and it is not hard to prove.
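For completeness, here is a sketch of that proof. It assumes the process is written as X_t = c + φ X_{t-1} + ε_t with |φ| < 1, that it is stationary with mean μ and variance σ², and that ε_t is zero-mean white noise with variance σ_ε² independent of X_{t-1}. The symbols follow the usual textbook notation rather than anything fixed by this answer; φ plays the role of corr.

```latex
\begin{align*}
\mu &= \mathbb{E}[X_t] = c + \varphi\,\mu
  \;\Rightarrow\; \mu = \frac{c}{1-\varphi}, \\
\sigma^2 &= \operatorname{Var}(X_t) = \varphi^2\sigma^2 + \sigma_\varepsilon^2
  \;\Rightarrow\; \sigma^2 = \frac{\sigma_\varepsilon^2}{1-\varphi^2}, \\
\operatorname{Cov}(X_t, X_{t-1})
  &= \operatorname{Cov}(c + \varphi X_{t-1} + \varepsilon_t,\; X_{t-1})
   = \varphi\,\sigma^2, \\
\rho_1 &= \frac{\operatorname{Cov}(X_t, X_{t-1})}{\sigma^2} = \varphi .
\end{align*}
```

Solving the first two lines for c and σ_ε gives c = μ(1 − φ) and σ_ε = σ·sqrt(1 − φ²), which are the quantities needed to hit a target mean and variance.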

Here is some sample code:

import numpy as np

def sample_signal(n_samples, corr, mu=0, sigma=1):
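    # An AR(1) process is stationary for any parameter strictly inside (-1, 1),
    # so zero and negative auto-correlations are valid targets too.
    assert -1 < corr < 1, "Auto-correlation must be strictly between -1 and 1"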

    # Find out the offset `c` and the std of the white noise `sigma_e`
    # that produce a signal with the desired mean and variance.
    # See https://en.wikipedia.org/wiki/Autoregressive_model
    # under section "Example: An AR(1) process".
    c = mu * (1 - corr)
    sigma_e = np.sqrt((sigma ** 2) * (1 - corr ** 2))

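    # Sample the auto-regressive process. The first value is drawn from the
    # stationary distribution N(mu, sigma^2) to avoid a start-up transient.
    signal = [np.random.normal(mu, sigma)]
    for _ in range(1, n_samples):
        signal.append(c + corr * signal[-1] + np.random.normal(0, sigma_e))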

    return np.array(signal)

def compute_corr_lag_1(signal):
    return np.corrcoef(signal[:-1], signal[1:])[0, 1]

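# Examples. Each call draws a fresh random signal, so the printed values
# only approximate the targets.
print(compute_corr_lag_1(sample_signal(5000, 0.5)))  # should be close to 0.5
print(np.mean(sample_signal(5000, 0.5, mu=2)))       # should be close to 2
print(np.std(sample_signal(5000, 0.5, sigma=3)))     # should be close to 3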

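The parameter corr sets the desired auto-correlation at lag one; the optional parameters mu and sigma control the mean and standard deviation of the generated signal. Because the signal is random, the sample statistics only approximate these targets, and the match improves as n_samples grows.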
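Since the question also mentions drawing from a multivariate normal, here is an alternative sketch that samples the whole signal in one call. It relies on the fact that a stationary Gaussian AR(1) process has covariance sigma^2 * corr^|i - j| between samples i and j. The function name sample_signal_mvn and its rng argument are hypothetical, introduced only for this illustration, and the full covariance matrix makes it practical only for moderate n_samples.

```python
import numpy as np


def sample_signal_mvn(n_samples, corr, mu=0.0, sigma=1.0, rng=None):
    """Draw one signal with AR(1)-style covariance from a multivariate normal.

    Hypothetical helper for illustration. It builds an n_samples x n_samples
    covariance matrix, so memory and time grow quickly with the length.
    """
    if not -1 < corr < 1:
        raise ValueError("corr must be strictly between -1 and 1")
    rng = np.random.default_rng() if rng is None else rng

    # Cov(x_i, x_j) = sigma^2 * corr^|i - j| is the covariance of a
    # stationary AR(1) process whose lag-one auto-correlation is `corr`.
    idx = np.arange(n_samples)
    lags = np.abs(idx[:, None] - idx[None, :])
    cov = sigma ** 2 * np.power(float(corr), lags)

    mean = np.full(n_samples, float(mu))
    return rng.multivariate_normal(mean, cov)


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed for repeatability
    x = sample_signal_mvn(1000, 0.5, mu=2.0, sigma=3.0, rng=rng)
    print(np.corrcoef(x[:-1], x[1:])[0, 1])  # sample lag-one correlation
    print(x.mean(), x.std())                 # sample mean and std
```

For long signals the recursive AR(1) sampler is the better fit, since it needs only O(n_samples) memory; the multivariate-normal version is mainly useful as a cross-check on short signals.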

Dan Oneață answered Oct 21 '22 15:10