Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cross correlation using mathdotnet

I have recently started using Mathdotnet Numerics statistical package to do data analysis in c#.

I am looking for the cross correlation function. Does Mathdotnet have an API for this?

Previously I have been using MATLAB xcorr or Python numpy.correlate. So I am looking for a C# equivalent of these.

I have looked through their documentation but it isn't very straightforward. https://numerics.mathdotnet.com/api/

like image 712
lsama Avatar asked Sep 26 '17 06:09

lsama


People also ask

How do you calculate cross-correlation?

Cross-Correlation It is calculated simply by multiplying and summing two-time series together. In the following example, graphs A and B are cross-correlated but graph C is not correlated to either.

What is cross-correlation method?

Cross-correlation is a measurement that tracks the movements of two or more sets of time series data relative to one another. It is used to compare multiple time series and objectively determine how well they match up with each other and, in particular, at what point the best match occurs.

Why cross-correlation is used?

Cross-correlation is used to evaluate the similarity between the spectra of two different systems, for example, a sample spectrum and a reference spectrum. This technique can be used for samples where background fluctuations exceed the spectral differences caused by changes in composition.

Is cross-correlation the same as correlation?

Correlation defines the degree of similarity between two indicates. If the indicates are alike, then the correlation coefficient will be 1 and if they are entirely different then the correlation coefficient will be 0. When two independent indicates are compared, this procedure will be called as cross-correlation.


2 Answers

Correlation can be calculated by any of the methods from MathNet.Numerics.Statistics.Correlation, like Pearson or Spearman. But if you're looking for results like the ones provided by Matlab's xcorr or autocorr, then you have to manually calculate the correlation using those methods for each lag/delay value between your input samples. Notice this example includes both, cross and auto correlation.

enter image description here

double fs = 50; //sampling rate, Hz
double te = 1; //end time, seconds
int size = (int)(fs * te); //sample size

var t = Enumerable.Range(0, size).Select(p => p / fs).ToArray();
var y1 = t.Select(p => p < te / 2 ? 1.0 : 0).ToArray();
var y2 = t.Select(p => p < te / 2 ? 1.0 - 2*p : 0).ToArray();

var r12 = StatsHelper.CrossCorrelation(y1, y2); // Y1 * Y2
var r21 = StatsHelper.CrossCorrelation(y2, y1); // Y2 * Y1
var r11 = StatsHelper.CrossCorrelation(y1, y1); // Y1 * Y1 autocorrelation

StatsHelper:

public static class StatsHelper
{
    public static LagCorr CrossCorrelation(double[] x1, double[] x2)
    {
        if (x1.Length != x2.Length)
            throw new Exception("Samples must have same size.");

        var len = x1.Length;
        var len2 = 2 * len;
        var len3 = 3 * len;
        var s1 = new double[len3];
        var s2 = new double[len3];
        var cor = new double[len2];
        var lag = new double[len2];

        Array.Copy(x1, 0, s1, len, len);
        Array.Copy(x2, 0, s2, 0, len);

        for (int i = 0; i < len2; i++)
        {
            cor[i] = Correlation.Pearson(s1, s2);
            lag[i] = i - len;
            Array.Copy(s2,0,s2,1,s2.Length-1);
            s2[0] = 0;
        }

        return new LagCorr { Corr = cor, Lag = lag };
    }
}

LagCorr:

public class LagCorr
{
    public double[] Lag { get; set; }
    public double[] Corr { get; set; }
}

EDIT: Adding Matlab comparison results:

clear;
step=0.02;
t=[0:step:1-step];
y1=ones(1,50);
y1(26:50)=0;
y2=[1-2*t];
y2(26:50)=0;

[cor12,lags12]=xcorr(y1,y2);
[cor21,lags21]=xcorr(y2,y1);
[cor11,lags11]=xcorr(y1,y1);
[cor22,lags22]=xcorr(y2,y2);

subplot(2,3,1);
plot(t,y1);
title('Y1');
axis([0 1 -0.5 1.5]);

subplot(2,3,2);
plot(lags12,cor12);
title('Y1*Y2');
axis([-30 30 0 15]);

subplot(2,3,3);
plot(lags11,cor11);
title('Y1*Y1');
axis([-30 30 0 30]);

subplot(2,3,4);
plot(t,y2);
title('Y2');
axis([0 1 -0.5 1.5]);

subplot(2,3,5);
plot(lags21,cor21);
title('Y2*Y1');
axis([-30 30 0 15]);

subplot(2,3,6);
plot(lags22,cor22);
title('Y2*Y2');
axis([-30 30 0 10]);

enter image description here

like image 107
jsanalytics Avatar answered Sep 19 '22 23:09

jsanalytics


I have tried the above solution with a sine wave that was shifted backwards by 20 time units with respect to a first sine wave. It gave me the correct result that the maximum of the correlation is at -20 (see below). One could discuss whether its appropriate to apply a zero padding, the zeros are not usually part of the signal. The MATLAB cross-correlation is not normalized the same way, it's not a "Pearson correlation" as in the example above.

The definition of the MATLAB cross-correlation is different: for scaling option "none" its a convolution with the time reversed signal. There are also various scaling options but none of them gives the same result as the Pearson correlation: matlab definition of xcorr

My result: cross correlation of sin(n*0.1) with sin(n*0.1 - 20*0.1) using the example above:

enter image description here

like image 41
Jens Kluge Avatar answered Sep 19 '22 23:09

Jens Kluge