
Correlation of two arrays in C#

Given two arrays of double values, I want to compute the correlation coefficient (a single double value, just like the CORREL function in MS Excel). Is there a simple one-line solution in C#?

I already discovered a math library called Meta Numerics. According to this SO question, it should do the job. Here are the docs for the Meta Numerics correlation method, which I don't understand.

Could somebody please provide a simple code snippet or example of how to use the library?

Note: In the end, I was forced to use one of the custom implementations. But if someone reading this question knows a good, well-documented C# math library/framework that does this, please don't hesitate to post a link in an answer.

teejay asked Jul 03 '13 12:07


4 Answers

You can have the values in separate lists at the same index and use a simple Zip.

var fitResult = new FitResult();
var values1 = new List<int>();
var values2 = new List<int>();

var correls = values1.Zip(values2, (v1, v2) =>
    fitResult.CorrelationCoefficient(v1, v2));

A second way is to write your own custom implementation (mine isn't optimized for speed):

public double ComputeCoeff(double[] values1, double[] values2) {
    if (values1.Length != values2.Length)
        throw new ArgumentException("values must be the same length");

    var avg1 = values1.Average();
    var avg2 = values2.Average();

    var sum1 = values1.Zip(values2, (x1, y1) => (x1 - avg1) * (y1 - avg2)).Sum();

    var sumSqr1 = values1.Sum(x => Math.Pow((x - avg1), 2.0));
    var sumSqr2 = values2.Sum(y => Math.Pow((y - avg2), 2.0));

    var result = sum1 / Math.Sqrt(sumSqr1 * sumSqr2);

    return result;
}

Usage:

var values1 = new List<double> { 3, 2, 4, 5, 6 };
var values2 = new List<double> { 9, 7, 12, 15, 17 };

var result = ComputeCoeff(values1.ToArray(), values2.ToArray());
// 0.997054485501581

Debug.Assert(result.ToString("F6") == "0.997054");
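One edge case to watch with a hand-rolled version like ComputeCoeff: if either array is constant, its sum of squared deviations is zero and the division becomes 0.0 / 0.0, which is NaN in double arithmetic and does not throw. Here is a minimal sketch of that case. The class name PearsonSketch and the method name Pearson are mine, picked only so the example compiles on its own; the formula is the same as in ComputeCoeff.

```csharp
using System;
using System.Linq;

public static class PearsonSketch
{
    // Same formula as ComputeCoeff, restated so this sketch is self-contained.
    public static double Pearson(double[] xs, double[] ys)
    {
        if (xs.Length != ys.Length)
            throw new ArgumentException("values must be the same length");

        double meanX = xs.Average();
        double meanY = ys.Average();

        double sumXY = xs.Zip(ys, (x, y) => (x - meanX) * (y - meanY)).Sum();
        double sumXX = xs.Sum(x => (x - meanX) * (x - meanX));
        double sumYY = ys.Sum(y => (y - meanY) * (y - meanY));

        return sumXY / Math.Sqrt(sumXX * sumYY);
    }

    public static void Main()
    {
        double[] xs = { 3, 2, 4, 5, 6 };
        double[] flat = { 1, 1, 1, 1, 1 };   // zero variance

        double r = Pearson(xs, flat);        // evaluates 0.0 / 0.0

        // Test with double.IsNaN; the comparison r == double.NaN is always false.
        Console.WriteLine(double.IsNaN(r));  // prints True
    }
}
```

Whether NaN is the right answer for a constant series depends on the caller; some code prefers to throw or to return 0 instead, so treat this as a check to be aware of, not a recommendation.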

Another way is to use the Excel function directly:

var values1 = new List<double> { 3, 2, 4, 5, 6 };
var values2 = new List<double> { 9, 7, 12, 15, 17 };

// Make sure to add a reference to Microsoft.Office.Interop.Excel.dll
// and use the namespace

var application = new Application();

var worksheetFunction = application.WorksheetFunction;

var result = worksheetFunction.Correl(values1.ToArray(), values2.ToArray());

Console.Write(result); // 0.997054485501581

application.Quit(); // close the Excel instance started above
Dustin Kingen answered Sep 30 '22 11:09


Math.NET Numerics is a well-documented math library that contains a Correlation class. It calculates Pearson and Spearman ranked correlations: http://numerics.mathdotnet.com/api/MathNet.Numerics.Statistics/Correlation.htm

The library is available under the very liberal MIT/X11 license. Using it to calculate a correlation coefficient is as easy as follows:

using MathNet.Numerics.Statistics;

...

correlation = Correlation.Pearson(arrayOfValues1, arrayOfValues2);

Good luck!

Ruben Ramirez Padron answered Sep 30 '22 12:09


To calculate the Pearson product-moment correlation coefficient (http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient), you can use this simple code:

  public static Double Correlation(Double[] Xs, Double[] Ys) {
    Double sumX = 0;
    Double sumX2 = 0;
    Double sumY = 0;
    Double sumY2 = 0;
    Double sumXY = 0;

    int n = Xs.Length < Ys.Length ? Xs.Length : Ys.Length;

    for (int i = 0; i < n; ++i) {
      Double x = Xs[i];
      Double y = Ys[i];

      sumX += x;
      sumX2 += x * x;
      sumY += y;
      sumY2 += y * y;
      sumXY += x * y;
    }

    Double stdX = Math.Sqrt(sumX2 / n - sumX * sumX / n / n);
    Double stdY = Math.Sqrt(sumY2 / n - sumY * sumY / n / n);
    Double covariance = (sumXY / n - sumX * sumY / n / n);

    return covariance / stdX / stdY; 
  }
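For readers mapping the code to the formula: the method accumulates five running sums in one pass and then evaluates the population form of Pearson's r. The symbols below are my own notation, not taken from the answer; n is the shorter of the two array lengths, as in the code.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i

\sigma_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2}, \qquad
\sigma_y = \sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i^2 - \bar{y}^2}

\operatorname{cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}, \qquad
r = \frac{\operatorname{cov}(x, y)}{\sigma_x\,\sigma_y}
```

Dividing by n instead of n - 1 makes no difference to r, because the same factor appears in the covariance and in the product of the two standard deviations and cancels.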
Dmitry Bychenko answered Sep 30 '22 11:09


If you don't want to use a third-party library, you can use the method from this post (posting the code here as a backup).

public double Correlation(double[] array1, double[] array2)
{
    double[] array_xy = new double[array1.Length];
    double[] array_xp2 = new double[array1.Length];
    double[] array_yp2 = new double[array1.Length];
    for (int i = 0; i < array1.Length; i++)
        array_xy[i] = array1[i] * array2[i];
    for (int i = 0; i < array1.Length; i++)
        array_xp2[i] = Math.Pow(array1[i], 2.0);
    for (int i = 0; i < array1.Length; i++)
        array_yp2[i] = Math.Pow(array2[i], 2.0);
    double sum_x = 0;
    double sum_y = 0;
    foreach (double n in array1)
        sum_x += n;
    foreach (double n in array2)
        sum_y += n;
    double sum_xy = 0;
    foreach (double n in array_xy)
        sum_xy += n;
    double sum_xpow2 = 0;
    foreach (double n in array_xp2)
        sum_xpow2 += n;
    double sum_ypow2 = 0;
    foreach (double n in array_yp2)
        sum_ypow2 += n;
    double Ex2 = Math.Pow(sum_x, 2.00);
    double Ey2 = Math.Pow(sum_y, 2.00);

    return (array1.Length * sum_xy - sum_x * sum_y) /
           Math.Sqrt((array1.Length * sum_xpow2 - Ex2) * (array1.Length * sum_ypow2 - Ey2));
}
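The return expression is the raw-sum form of Pearson's r. Written out (the notation is mine; n stands for array1.Length):

```latex
r = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
         {\sqrt{\left(n \sum_{i=1}^{n} x_i^2 - \Bigl(\sum_{i=1}^{n} x_i\Bigr)^{2}\right)
                \left(n \sum_{i=1}^{n} y_i^2 - \Bigl(\sum_{i=1}^{n} y_i\Bigr)^{2}\right)}}
```

Note that Ex2 and Ey2 in the code hold the squared sums, (Σx)² and (Σy)², not expected values. Also, the loops index array2 by array1.Length, so the method assumes array2 is at least as long as array1.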
keyboardP answered Sep 30 '22 10:09