Having two arrays of double values, I want to compute correlation coefficient (single double value, just like the CORREL function in MS Excel). Is there some simple one-line solution in C#?
I already discovered math lib called Meta Numerics. According to this SO question, it should do the job. Here is docs for Meta Numerics correlation method, which I don't get.
Could pls somebody provide me with simple code snippet or example how to use the library?
Note: At the end, I was forced to use one of custom implementations. But if someone reading this question knows good, well documented C# math library/framework to do this, please don't hesitate and post a link in answer.
You can have the values in separate lists at the same index and use a simple Zip
.
var fitResult = new FitResult(); var values1 = new List<int>(); var values2 = new List<int>(); var correls = values1.Zip(values2, (v1, v2) => fitResult.CorrelationCoefficient(v1, v2));
A second way is to write your own custom implementation (mine isn't optimized for speed):
public double ComputeCoeff(double[] values1, double[] values2) { if(values1.Length != values2.Length) throw new ArgumentException("values must be the same length"); var avg1 = values1.Average(); var avg2 = values2.Average(); var sum1 = values1.Zip(values2, (x1, y1) => (x1 - avg1) * (y1 - avg2)).Sum(); var sumSqr1 = values1.Sum(x => Math.Pow((x - avg1), 2.0)); var sumSqr2 = values2.Sum(y => Math.Pow((y - avg2), 2.0)); var result = sum1 / Math.Sqrt(sumSqr1 * sumSqr2); return result; }
Usage:
var values1 = new List<double> { 3, 2, 4, 5 ,6 }; var values2 = new List<double> { 9, 7, 12 ,15, 17 }; var result = ComputeCoeff(values1.ToArray(), values2.ToArray()); // 0.997054485501581 Debug.Assert(result.ToString("F6") == "0.997054");
Another way is to use the Excel function directly:
var values1 = new List<double> { 3, 2, 4, 5 ,6 }; var values2 = new List<double> { 9, 7, 12 ,15, 17 }; // Make sure to add a reference to Microsoft.Office.Interop.Excel.dll // and use the namespace var application = new Application(); var worksheetFunction = application.WorksheetFunction; var result = worksheetFunction.Correl(values1.ToArray(), values2.ToArray()); Console.Write(result); // 0.997054485501581
Math.NET Numerics is a well-documented math library that contains a Correlation class. It calculates Pearson and Spearman ranked correlations: http://numerics.mathdotnet.com/api/MathNet.Numerics.Statistics/Correlation.htm
The library is available under the very liberal MIT/X11 license. Using it to calculate a correlation coefficient is as easy as follows:
using MathNet.Numerics.Statistics; ... correlation = Correlation.Pearson(arrayOfValues1, arrayOfValues2);
Good luck!
In order to calculate Pearson product-moment correlation coefficient
http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
You can use this simple code:
public static Double Correlation(Double[] Xs, Double[] Ys) {
Double sumX = 0;
Double sumX2 = 0;
Double sumY = 0;
Double sumY2 = 0;
Double sumXY = 0;
int n = Xs.Length < Ys.Length ? Xs.Length : Ys.Length;
for (int i = 0; i < n; ++i) {
Double x = Xs[i];
Double y = Ys[i];
sumX += x;
sumX2 += x * x;
sumY += y;
sumY2 += y * y;
sumXY += x * y;
}
Double stdX = Math.Sqrt(sumX2 / n - sumX * sumX / n / n);
Double stdY = Math.Sqrt(sumY2 / n - sumY * sumY / n / n);
Double covariance = (sumXY / n - sumX * sumY / n / n);
return covariance / stdX / stdY;
}
If you don't want to use a third party library, you can use the method from this post (posting code here for backup).
public double Correlation(double[] array1, double[] array2)
{
double[] array_xy = new double[array1.Length];
double[] array_xp2 = new double[array1.Length];
double[] array_yp2 = new double[array1.Length];
for (int i = 0; i < array1.Length; i++)
array_xy[i] = array1[i] * array2[i];
for (int i = 0; i < array1.Length; i++)
array_xp2[i] = Math.Pow(array1[i], 2.0);
for (int i = 0; i < array1.Length; i++)
array_yp2[i] = Math.Pow(array2[i], 2.0);
double sum_x = 0;
double sum_y = 0;
foreach (double n in array1)
sum_x += n;
foreach (double n in array2)
sum_y += n;
double sum_xy = 0;
foreach (double n in array_xy)
sum_xy += n;
double sum_xpow2 = 0;
foreach (double n in array_xp2)
sum_xpow2 += n;
double sum_ypow2 = 0;
foreach (double n in array_yp2)
sum_ypow2 += n;
double Ex2 = Math.Pow(sum_x, 2.00);
double Ey2 = Math.Pow(sum_y, 2.00);
return (array1.Length * sum_xy - sum_x * sum_y) /
Math.Sqrt((array1.Length * sum_xpow2 - Ex2) * (array1.Length * sum_ypow2 - Ey2));
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With