
Python: Interpretation on XCORR

I have a question on xcorr in Python. Say that I do the following:

output=plt.xcorr(x,y, maxlags=4)

Which time series is lagged? The output will be the cross-correlation between x and y at lags -4 to +4. So does the output refer to the cross-correlation between x and y as follows:

[image of the formula] or is it the reverse between x and y? I tried to dig into the code of xcorr to get a better idea, but I am a bit lost at np.correlate(x, y, mode=2). What does mode=2 mean? I only see the modes valid, same, and full listed.

Plug4 asked Jun 24 '14 21:06


1 Answer

The mode parameter determines what happens near the boundaries. If you have input vectors with lengths x and y (x > y):

  • valid / 0: you will only receive the portion of the convolution where the two signals overlap completely (x-y+1 points)
  • same / 1: the length of the output vector is the same as the length of the longer input vector (x points)
  • full / 2: all data from the area where there is even a single sample of overlap between the signals (x+y-1 points)
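A quick way to confirm these lengths is to run all three modes on a pair of short vectors. This is a minimal sketch, assuming NumPy is installed; the vectors are illustrative ones with lengths 5 and 2, and the string mode names are used rather than the numeric ones.

```python
import numpy as np

a = np.arange(1, 6)        # length 5 (the longer input)
b = np.array([10, 20])     # length 2 (the shorter input)

# Output length of the convolution for each boundary mode.
lengths = {
    mode: len(np.convolve(a, b, mode=mode))
    for mode in ("valid", "same", "full")
}
print(lengths)  # {'valid': 4, 'same': 5, 'full': 6}
```

With x = 5 and y = 2 this matches x-y+1 = 4, x = 5 and x+y-1 = 6.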

The numbers for these modes are not well documented, but they can be found in numpy's source code. In any case, xcorr uses the full mode. (Actually, only the first letters of the mode names matter when giving the mode for convolve or correlate.)

There is some confusion as to what these functions really do. numpy.correlate has two different behaviours depending on the numpy version. Internally these are known as multiarray.correlate (old) and multiarray.correlate2 (new). numpy.convolve reverses the second input vector and then uses multiarray.correlate (i.e. the one deprecated for correlation).

So, if you want to be really sure, test what happens. The basic operation is a sum of products between the two vectors, with one vector moved one position at a time. To clarify this, I'll use some numeric examples with two vectors.

a = [1,2,3,4,5]
b = [10,20]

Let's first look at convolve:

numpy.convolve(a,b,mode='full') => [ 10, 40, 70, 100, 130, 100]

This is because:

    1  2  3  4  5  => 1 x 10 = 10
20 10

    1  2  3  4  5  => 1 x 20 + 2 x 10 = 40
   20 10

...

    1  2  3  4  5     => 5 x 20 = 100
               20 10

Different modes return the same data but truncated at each end.
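The sliding picture above can be written out as explicit sums and compared against numpy. This is only a sketch: `sliding_convolve` is a hypothetical helper written for illustration, not something numpy provides.

```python
import numpy as np

def sliding_convolve(a, b):
    """Full convolution as explicit sums of products (illustrative helper)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # a[i] * b[j] contributes to output position i + j.
            out[i + j] += ai * bj
    return out

a = [1, 2, 3, 4, 5]
b = [10, 20]

manual = sliding_convolve(a, b)
full = np.convolve(a, b, mode="full")
valid = np.convolve(a, b, mode="valid")

print(manual)          # [10, 40, 70, 100, 130, 100]
print(full.tolist())   # [10, 40, 70, 100, 130, 100]
print(valid.tolist())  # [40, 70, 100, 130], i.e. full with both ends cut off
```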

For correlation:

numpy.correlate(a,b,mode='full') => [ 20, 50, 80, 110, 140, 50]

    1  2  3  4  5  => 1 x 20 = 20
10 20

    1  2  3  4  5  => 1 x 10 + 2 x 20 = 50
   10 20

...

    1  2  3  4  5     => 5 x 10 = 50
               10 20

So, basically, the only difference with real numbers is that one of the vectors is mirrored. This has some consequences: convolution gives the same result if a and b are swapped, whereas correlation gives the reversed result in that case. With complex numbers, correlate also conjugates the second vector prior to the calculations above.
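These relationships are easy to check numerically. The following is a minimal sketch, assuming a current numpy (the new correlate behaviour); the complex vectors `a_c` and `z` are made-up test values.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20])

corr_ab = np.correlate(a, b, mode="full")
corr_ba = np.correlate(b, a, mode="full")
conv_ab = np.convolve(a, b, mode="full")
conv_ba = np.convolve(b, a, mode="full")

print(corr_ab.tolist())  # [20, 50, 80, 110, 140, 50]
print(corr_ba.tolist())  # [50, 140, 110, 80, 50, 20] (reversed)
print(conv_ab.tolist())  # [10, 40, 70, 100, 130, 100]
print(conv_ba.tolist())  # [10, 40, 70, 100, 130, 100] (unchanged)

# For real input, correlation equals convolution with the mirrored second vector.
mirrored_ok = np.array_equal(corr_ab, np.convolve(a, b[::-1], mode="full"))

# For complex input, the second vector is conjugated as well as mirrored.
a_c = np.array([1 + 1j, 2, 3j])
z = np.array([1j, 2 - 1j])
conj_ok = np.allclose(
    np.correlate(a_c, z, mode="full"),
    np.convolve(a_c, np.conj(z)[::-1], mode="full"),
)
print(mirrored_ok, conj_ok)
```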


Back to matplotlib's xcorr graph. It receives two vectors x and y with equal lengths and calculates the cross-correlation of these vectors at different lags.

It first calculates the full correlation of x and y with numpy.correlate, as shown above. Then it draws the correlation results from the full output vector at lags -maxlags..maxlags. The rule is that the second input vector is shifted. At the leftmost graph position the second vector y is at its leftmost position (i.e. shifted to the left from x).

The easiest way to check this may be:

xcorr([1.,2.,3.,4.,5.], [0,0,0,0,1.], normed=False, maxlags=4)
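The same check can be done without a plot by slicing the full correlation by hand. This is a sketch that assumes xcorr selects the lags from the numpy.correlate output as described above; `xcorr_values` is a hypothetical helper, not part of matplotlib.

```python
import numpy as np

def xcorr_values(x, y, maxlags):
    """Hypothetical helper: lags and unnormalized correlations, no plotting."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y):
        raise ValueError("x and y must have equal length")
    n = len(x)
    # The full output has 2*n - 1 points; lag 0 sits at index n - 1.
    full = np.correlate(x, y, mode="full")
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, full[n - 1 - maxlags : n + maxlags]

lags, c = xcorr_values([1., 2., 3., 4., 5.], [0, 0, 0, 0, 1.], maxlags=4)
for lag, value in zip(lags, c):
    print(lag, value)
```

The values come out as 1, 2, 3, 4, 5 at lags -4..0 and 0 at lags 1..4. At lag -4 the single spike in y (its last sample) lines up with the first sample of x, so the value at lag k is the sum over n of x[n+k] * y[n].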
DrV answered Sep 21 '22 00:09