I have a matrix x (30x2000) of 2000 gene expressions in different cell lines and a vector y (30x1) of a continuous outcome variable. I want to calculate the Pearson correlation between each gene and the outcome, so I expect a 2000x1 vector of r-values. I've used rcorr(x,y) but the result is a 2000x2000 matrix, so I guess it's ignoring y and calculating all genes against all genes. The manual says:
x = a numeric matrix with at least 5 rows and at least 2 columns (if y is absent)
But can x have more than one column when y is supplied too? Do I have to use a different function?
An NA result can have two causes. One is an NA in your data. The other is that one of the variables is constant, so its standard deviation is zero and the cor function returns NA.
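Both causes can be reproduced in a few lines. This is a minimal sketch on made-up vectors (the names const, with_na and the values are only illustrative); it also shows the use = "complete.obs" argument of cor, which drops incomplete pairs before computing the correlation.

```r
# Made-up illustrative data, not from the question
y       <- c(2, 4, 5, 4, 5)
const   <- rep(3, 5)            # constant variable: standard deviation is zero
with_na <- c(2, NA, 5, 4, 5)    # variable containing a missing value

# Cause 1: zero standard deviation. cor() warns and returns NA.
r_const <- suppressWarnings(cor(const, y))

# Cause 2: an NA in the data. With the default use = "everything" the result is NA.
r_na <- cor(with_na, y)

# Dropping incomplete pairs gives a numeric result for the second case.
r_complete <- cor(with_na, y, use = "complete.obs")

print(c(r_const = r_const, r_na = r_na, r_complete = r_complete))
```

Note that use = "complete.obs" does not help with a constant column; such columns have to be removed or handled separately.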
In this method, you call cor() on the columns of interest, selected by passing a vector of column names, to get the correlations among multiple variables in R.
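As a rough illustration of selecting columns by name, the sketch below uses R's built-in mtcars data set. The choice of data set and of the columns mpg, hp and wt is an assumption made for the example, not something from the question.

```r
# Column names to correlate (chosen only for illustration)
vars <- c("mpg", "hp", "wt")

# Subset the data frame by name and pass the result to cor()
cm <- cor(mtcars[, vars])

# cm is a symmetric 3 x 3 matrix with ones on the diagonal
print(round(cm, 3))
```

The row and column names of the result are taken from the selected columns, so a single pair can be read off with, for example, cm["mpg", "wt"].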
In multiple linear regression, the correlation matrix shows the correlation coefficients between the independent variables in a model.
Using the function cor will work. In general, if x is MxN and y is MxP, then cor(x,y) will be an NxP matrix where the entry (i,j) is the correlation between x[,i] and y[,j].
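To check the dimension rule at the size described in the question, here is a small sketch on simulated data. The random values stand in for the real expression matrix, and y2 is a hypothetical second outcome matrix added only to show the P > 1 case.

```r
set.seed(42)

# Simulated stand-ins: 30 cell lines x 2000 genes, one outcome per cell line
x <- matrix(rnorm(30 * 2000), nrow = 30)
y <- rnorm(30)

# M = 30, N = 2000, P = 1, so the result is a 2000 x 1 matrix
r <- cor(x, y)
print(dim(r))

# drop() removes the length-one dimension and leaves a plain vector
r_vec <- drop(r)

# Entry i is the correlation between gene i and the outcome
first_ok <- isTRUE(all.equal(r_vec[1], cor(x[, 1], y)))

# With a hypothetical 30 x 3 outcome matrix the result is 2000 x 3
y2 <- matrix(rnorm(30 * 3), nrow = 30)
r2 <- cor(x, y2)
print(dim(r2))
```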
Building on SimonO101's reproducible example:
> set.seed(1)
> x <- matrix( runif(12) , nrow = 3 )
> y <- runif(3)
> cor(x,y)
[,1]
[1,] 0.3712437
[2,] 0.9764443
[3,] 0.2249998
[4,] -0.4903723
If you want just a vector and not a matrix:
> array(cor(x,y))
[1] 0.3712437 0.9764443 0.2249998 -0.4903723
You need to apply the cor function across the columns of your x matrix...
apply( x , 2 , cor , y = y )
# For reproducible data
set.seed(1)
# 3 x 4 matrix
x <- matrix( runif(12) , nrow = 3 )
# [,1] [,2] [,3] [,4]
#[1,] 0.2655087 0.9082078 0.9446753 0.06178627
#[2,] 0.3721239 0.2016819 0.6607978 0.20597457
#[3,] 0.5728534 0.8983897 0.6291140 0.17655675
# Length 3 vector
y <- runif(3)
#[1] 0.6870228 0.3841037 0.7698414
# Length 4 output vector
apply( x , 2 , cor , y = y )
#[1] 0.3712437 0.9764443 0.2249998 -0.4903723
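The apply approach and a direct call to cor(x, y) should give the same numbers, which can be checked on the same reproducible data. The variable names r_apply, r_direct and same are just for this sketch.

```r
# Same reproducible data as above
set.seed(1)
x <- matrix(runif(12), nrow = 3)
y <- runif(3)

# Column-by-column correlation via apply
r_apply <- apply(x, 2, cor, y = y)

# Single vectorised call, flattened from a 4 x 1 matrix to a vector
r_direct <- as.vector(cor(x, y))

# Compare the two results up to floating-point tolerance
same <- isTRUE(all.equal(r_apply, r_direct))
print(same)
```

For a wide matrix such as 2000 genes, the single cor(x, y) call avoids one function call per column, while apply stays convenient when each column needs extra handling.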