How to compute P-value and standard error from correlation analysis of R's cor()

Tags: r, correlation

I have data that contain 54 samples for each condition (x and y). I have computed the correlation the following way:

> dat <- read.table("http://dpaste.com/1064360/plain/",header=TRUE)
> cor(dat$x,dat$y)
[1] 0.2870823

Is there a built-in way in R to get the standard error (SE) of the correlation computed by cor() above, along with the p-value from the t-test?

As explained on this web page (page 14.6).

asked Apr 19 '13 04:04 by neversaint


People also ask

How do you calculate standard error of correlation coefficient in r?

There are two common approaches. The first uses the formula SE = sqrt((1 - r^2)/(n - 2)), which results in a symmetric confidence interval. The second is based on Fisher's r-to-z transformation and uses the formula SE = 1/sqrt(n - 3); this approach results in asymmetric CIs.
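
As a rough sketch of the two approaches, the snippet below plugs in the r and n from the question (0.2870823 and 54). Using a t quantile for the symmetric interval and a normal quantile for the Fisher interval is an assumption here, and the variable names are only illustrative.

```r
r <- 0.2870823   # correlation reported by cor() in the question
n <- 54          # number of paired samples

# Approach 1: standard error on the r scale, symmetric interval around r
se_r   <- sqrt((1 - r^2) / (n - 2))
t_crit <- qt(0.975, df = n - 2)
ci_sym <- c(r - t_crit * se_r, r + t_crit * se_r)

# Approach 2: Fisher r-to-z; the interval is built on the z scale and
# back-transformed with tanh(), so it is not symmetric around r
z         <- atanh(r)
se_z      <- 1 / sqrt(n - 3)
ci_fisher <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(c(se_r = se_r, se_z = se_z))
print(rbind(symmetric = ci_sym, fisher = ci_fisher))
```

Note that se_z is a standard error on the z scale, so it is not directly comparable to se_r.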

How do you find the p-value in a correlation test in r?

For the Pearson correlation, the p-value (significance level) can be determined by using the correlation coefficient table for the degrees of freedom df = n − 2, where n is the number of observations in the x and y variables.

How do you find p-value from correlation?

The p-value is calculated using a t-distribution with n − 2 degrees of freedom. The formula for the test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2). The value of the test statistic, t, is shown in the computer or calculator output along with the p-value. The test statistic t has the same sign as the correlation coefficient r.
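
To make the calculation concrete, here is a minimal sketch that computes the test statistic and a two-sided p-value by hand, again using the r and n from the question. It relies only on pt() from base R; the two-sided alternative is an assumption (it is also the default in cor.test()).

```r
r  <- 0.2870823   # correlation from the question
n  <- 54          # number of paired samples
df <- n - 2       # degrees of freedom

# t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

print(c(t = t_stat, df = df, p.value = p_value))
```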


2 Answers

I think what you're looking for is simply the cor.test() function, which returns everything you need except the standard error of the correlation. However, as you can see, the formula for that is very straightforward, and cor.test() gives you all the inputs required to calculate it.

Using the data from the example (so you can compare it yourself with the results on page 14.6):

> cor.test(mydf$X, mydf$Y)

    Pearson's product-moment correlation

data:  mydf$X and mydf$Y
t = -5.0867, df = 10, p-value = 0.0004731
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.9568189 -0.5371871
sample estimates:
       cor 
-0.8492663 

If you wanted to, you could also create a function like the following to include the standard error of the correlation coefficient.

For convenience, here's the equation:

SE = sqrt((1 - r^2) / (n - 2))

Here r is the correlation estimate and n - 2 is the degrees of freedom, both of which are readily available in the output above (as the "estimate" and "parameter" elements of the returned object). Thus, a simple function could be:

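# Takes the object returned by cor.test() and appends the standard error
cor.test.plus <- function(x) {
  list(x, 
       # SE = sqrt((1 - r^2) / df), where x$parameter holds df = n - 2
       Standard.Error = unname(sqrt((1 - x$estimate^2)/x$parameter)))
}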

And use it as follows:

cor.test.plus(cor.test(mydf$X, mydf$Y))
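
If a flat result is easier to work with than a nested list, a variant along these lines might help. This is only a sketch: cor_summary is a hypothetical helper name (not part of base R), and the X and Y vectors are copied from the example data so the snippet runs on its own.

```r
# Hypothetical helper: takes two numeric vectors and returns r, its
# standard error, the t statistic, df and p-value as a named vector
cor_summary <- function(x, y) {
  ct <- cor.test(x, y)          # Pearson correlation by default
  r  <- unname(ct$estimate)
  df <- unname(ct$parameter)    # n - 2
  c(r       = r,
    se      = sqrt((1 - r^2) / df),
    t       = unname(ct$statistic),
    df      = df,
    p.value = ct$p.value)
}

# Example data (the X and Y columns of the 12-neighborhood example)
X <- c(50, 11, 2, 19, 26, 73, 81, 51, 11, 2, 19, 25)
Y <- c(22.1, 35.9, 57.9, 22.2, 42.4, 5.8, 3.6, 21.4, 55.2, 33.3, 32.4, 38.4)

res <- cor_summary(X, Y)
print(res)
```

Individual values can then be pulled out by name, for example res[["se"]] or res[["p.value"]].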

Here, "mydf" is defined as:

mydf <- structure(list(Neighborhood = c("Fair Oaks", "Strandwood", "Walnut Acres",
  "Discov. Bay", "Belshaw", "Kennedy", "Cassell", "Miner", "Sedgewick", 
  "Sakamoto", "Toyon", "Lietz"), X = c(50L, 11L, 2L, 19L, 26L, 
  73L, 81L, 51L, 11L, 2L, 19L, 25L), Y = c(22.1, 35.9, 57.9, 22.2, 
  42.4, 5.8, 3.6, 21.4, 55.2, 33.3, 32.4, 38.4)), .Names = c("Neighborhood", 
  "X", "Y"), class = "data.frame", row.names = c(NA, -12L))

answered Oct 31 '22 22:10 by A5C1D2H2I1M1N2O1R2T1


Can't you simply take the test statistic from the return value? The test statistic is the estimate divided by its standard error, so you can calculate the SE by dividing the estimate by the t statistic:

Using mydf in the answer above:

r = cor.test(mydf$X, mydf$Y)
tstat = r$statistic
estimate = r$estimate
estimate; tstat

       cor 
-0.8492663 
        t 
-5.086732 
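
The division itself could be sketched as follows, with a check that it agrees with the sqrt((1 - r^2)/(n - 2)) formula. The X and Y vectors are copied from the example data so the snippet is self-contained; the names se_from_t and se_from_formula are only illustrative.

```r
# Example data (the X and Y columns of the 12-neighborhood example)
X <- c(50, 11, 2, 19, 26, 73, 81, 51, 11, 2, 19, 25)
Y <- c(22.1, 35.9, 57.9, 22.2, 42.4, 5.8, 3.6, 21.4, 55.2, 33.3, 32.4, 38.4)

r <- cor.test(X, Y)

# Since t = estimate / SE, the standard error is estimate / t
se_from_t <- unname(r$estimate / r$statistic)

# Same quantity from the formula, using df = n - 2 stored in r$parameter
se_from_formula <- unname(sqrt((1 - r$estimate^2) / r$parameter))

print(c(from_t = se_from_t, from_formula = se_from_formula))
```

One caveat: if the estimated correlation is exactly 0, the t statistic is 0 as well and the division yields NaN, so the formula version is the safer of the two.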

answered Oct 31 '22 20:10 by Alex