
How to get correlation of two vectors in python [duplicate]

Tags: python, numpy

In matlab I use

a=[1,4,6]
b=[1,2,3]
corr(a,b)

which returns .9934. I've tried numpy.correlate but it returns something completely different. What is the simplest way to get the correlation of two vectors?

Luke Makk asked Oct 17 '13 13:10




1 Answer

The docs indicate that numpy.correlate is not what you are looking for:

numpy.correlate(a, v, mode='valid', old_behavior=False)
  Cross-correlation of two 1-dimensional sequences.
  This function computes the correlation as generally defined in signal processing texts:
     z[k] = sum_n a[n] * conj(v[n+k])
  with a and v sequences being zero-padded where necessary and conj being the conjugate.
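To make the difference concrete, here is a minimal sketch using the vectors from the question. The variable names `valid` and `full` are just illustrative. In the default 'valid' mode the result is a single unnormalized sum of products, not a coefficient between -1 and 1:

```python
import numpy as np

a = [1, 4, 6]
b = [1, 2, 3]

# Default mode='valid': one sliding dot product, 1*1 + 4*2 + 6*3
valid = np.correlate(a, b)
print(valid)  # [27]

# mode='full': the dot product at every possible lag
full = np.correlate(a, b, mode='full')
print(full)   # [ 3 14 27 16  6]
```

Nothing here subtracts the means or divides by the standard deviations, which is why the value is nowhere near .9934.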

Instead, as the other comments suggested, you are looking for a Pearson correlation coefficient. To do this with scipy try:

# pearsonr lives in scipy.stats; the old scipy.stats.stats path is deprecated
from scipy.stats import pearsonr
a = [1,4,6]
b = [1,2,3]
print(pearsonr(a,b))

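This gives the correlation coefficient followed by the p-value (recent SciPy versions print the same two numbers wrapped in a PearsonRResult object):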

(0.99339926779878274, 0.073186395040328034)
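If you only want the coefficient, the two values can be unpacked. This is a small sketch; the names `r` and `p_value` are my own, and it assumes SciPy is installed:

```python
from scipy.stats import pearsonr

a = [1, 4, 6]
b = [1, 2, 3]

# First value is the Pearson coefficient, second is the two-sided p-value
r, p_value = pearsonr(a, b)
print(f"r = {r:.4f}, p = {p_value:.4f}")  # r = 0.9934, p = 0.0732
```

The coefficient matches the .9934 that matlab reports in the question.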

You can also use numpy.corrcoef:

import numpy
print(numpy.corrcoef(a,b))

This gives:

[[ 1.          0.99339927]
 [ 0.99339927  1.        ]]
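The off-diagonal entries of that matrix are the correlation between a and b, so a single number can be pulled out by indexing. The sketch below also recomputes the value from the Pearson definition as a sanity check; `r_matrix` and `r_manual` are illustrative names, not part of numpy:

```python
import numpy as np

a = np.array([1, 4, 6], dtype=float)
b = np.array([1, 2, 3], dtype=float)

# Entry [0, 1] of the 2x2 matrix is corr(a, b)
r_matrix = np.corrcoef(a, b)[0, 1]

# Same quantity from the definition: covariance over the product of spreads
da = a - a.mean()
db = b - b.mean()
r_manual = (da @ db) / np.sqrt((da @ da) * (db @ db))

print(f"{r_matrix:.4f} {r_manual:.4f}")  # 0.9934 0.9934
```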
Hooked answered Oct 19 '22 20:10