
What is the difference between cor and cor.test in R

Tags:

r

correlation

I have a data frame whose columns are different samples of an experiment. I want to find the correlation between each pair of samples: between V2 and V3, between V2 and V4, and so on. This is the data frame:

> head(t1)
      V2          V3          V4         V5         V6
1 0.12725011 0.051021886 0.106049328 0.09378767 0.17799444
2 0.86096784 1.263327211 3.073650624 0.75607466 0.92244361
3 0.45791031 0.520207274 1.526476608 0.67499102 0.49817761
4 0.00000000 0.001139721 0.003158557 0.00000000 0.00000000
5 0.13383965 0.098943019 0.099922146 0.13871867 0.09750611
6 0.01016334 0.010187671 0.025410170 0.00000000 0.02369374
> nrow(t1)
[1] 23367

If I run cor on this data frame to get the correlation between samples (columns), I get NA for every pair of samples:

> cor(t1, method= "spearman")
   V2 V3 V4 V5 V6
V2  1 NA NA NA NA
V3 NA  1 NA NA NA
V4 NA NA  1 NA NA
V5 NA NA NA  1 NA
V6 NA NA NA NA  1

But if I run cor.test on two of the columns:

> cor.test(t1[,1],t1[,2], method="spearman")$estimate
rho 
0.92394 

I get an actual value instead of NA. Why is this so? What is the correct way of getting the correlation between these samples? Thank you in advance.

asked Feb 02 '13 10:02 by hora



1 Answer

Your data contains NA values.

From ?cor:

If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA.

From ?cor.test

na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").

On my system:

getOption("na.action")
[1] "na.omit"
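The difference can be reproduced on a small example. The sketch below uses a hypothetical data frame `toy` (made-up values, not the asker's `t1`) and assumes the goal is a Spearman matrix that ignores missing values pair by pair. The `use` argument of cor controls this: `"pairwise.complete.obs"` uses every pair of rows that is complete for the two columns being compared, while `"complete.obs"` would first drop any row that has an NA in any column.

```r
# Hypothetical toy data standing in for t1 (not the asker's values)
toy <- data.frame(
  V2 = c(0.13, 0.86, 0.46, NA,    0.10, 0.01),
  V3 = c(0.05, 1.26, 0.52, 0.001, NA,   0.03),
  V4 = c(0.11, 3.07, 1.53, 0.003, 0.09, 0.02)
)

# Default use = "everything": any NA in a column propagates,
# so the off-diagonal entries come back as NA
m_everything <- cor(toy, method = "spearman")

# Pairwise deletion: each entry uses the rows that are complete
# for that particular pair of columns
m_pairwise <- cor(toy, method = "spearman", use = "pairwise.complete.obs")

# cor.test on two vectors drops incomplete pairs before computing rho
rho <- unname(cor.test(toy$V2, toy$V3, method = "spearman")$estimate)

print(m_everything)
print(m_pairwise)
print(rho)
```

With pairwise deletion, the V2/V3 entry of the matrix should agree with the estimate from cor.test, which is why the two calls in the question appeared to disagree. Which `use` setting is appropriate depends on whether every correlation should be based on the same set of rows.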

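Use which(is.na(t1)) to search for NA values and which(!is.finite(as.matrix(t1))) to search for other problematic values such as Inf or NaN (is.finite does not work on a data frame directly, so convert it to a matrix first). cor can return NaN if you have Inf values in your data.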
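As a rough illustration of that check, here is a minimal sketch on a hypothetical two-column data frame (`toy` and its values are made up for the example). It assumes all columns are numeric, so that as.matrix yields a numeric matrix, and uses `arr.ind = TRUE` to report row and column positions rather than a single linear index.

```r
# Hypothetical data with one NA and one Inf (not the asker's values)
toy <- data.frame(V2 = c(0.13, NA, 0.46), V3 = c(0.05, 1.26, Inf))

# is.finite() has no data-frame method, so work on a numeric matrix
m <- as.matrix(toy)

# Row/column positions of NA values
na_pos <- which(is.na(m), arr.ind = TRUE)

# Row/column positions of anything non-finite (NA, NaN, Inf, -Inf)
bad_pos <- which(!is.finite(m), arr.ind = TRUE)

# Count of non-finite values per column, handy for wide data
bad_per_col <- colSums(!is.finite(m))

print(na_pos)
print(bad_pos)
print(bad_per_col)
```

On a data frame with tens of thousands of rows, the per-column counts are usually the quickest way to see which samples are affected.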

answered Oct 08 '22 06:10 by Roland