I have data containing 54 samples for each of two conditions (x and y), and I have computed the correlation as follows:
> dat <- read.table("http://dpaste.com/1064360/plain/",header=TRUE)
> cor(dat$x,dat$y)
[1] 0.2870823
Is there a native way in R to get the standard error of the correlation from the cor() call above, as well as a p-value from a t-test?
As explained on this web page (page 14.6), there are two approaches:
The first uses the formula SE = sqrt((1 - r^2)/(n - 2)), which results in a symmetric confidence interval. The second is based on Fisher's r-to-z transformation and uses the formula SE = 1/sqrt(n - 3); this approach results in asymmetric CIs.
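For illustration, both approaches can be computed by hand. This is a minimal sketch using the r and n from the question above; the variable names are mine, not from the referenced page:
r <- 0.2870823  # cor(dat$x, dat$y) from above
n <- 54
# Approach 1: symmetric CI from SE = sqrt((1 - r^2)/(n - 2))
se.t <- sqrt((1 - r^2) / (n - 2))
r + c(-1, 1) * qt(0.975, df = n - 2) * se.t
# Approach 2: Fisher's r-to-z, SE = 1/sqrt(n - 3), back-transformed with tanh()
z <- atanh(r)
se.z <- 1 / sqrt(n - 3)
tanh(z + c(-1, 1) * qnorm(0.975) * se.z)  # asymmetric on the r scale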
The p-value (significance level) of the correlation can be determined by looking up a correlation coefficient table at df = n - 2 degrees of freedom, where n is the number of observations in the x and y variables.
The p-value is calculated using a t-distribution with n - 2 degrees of freedom. The formula for the test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2). The value of the test statistic, t, is shown in the computer or calculator output along with the p-value. The test statistic t has the same sign as the correlation coefficient r.
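Plugging the question's values into these formulas (a sketch; cor.test() below does exactly this for you):
r <- 0.2870823
n <- 54
tstat <- r * sqrt(n - 2) / sqrt(1 - r^2)   # t = r*sqrt(n-2)/sqrt(1-r^2)
pval  <- 2 * pt(-abs(tstat), df = n - 2)   # two-sided p-value on n-2 df
c(t = tstat, p.value = pval)               # should match cor.test(dat$x, dat$y)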
I think what you're looking for is simply the cor.test() function, which will return everything you're looking for except the standard error of the correlation. However, the formula for that is very straightforward, and if you use cor.test(), you have all the inputs required to calculate it.
Using the data from the example (so you can compare it yourself with the results on page 14.6):
> cor.test(mydf$X, mydf$Y)
Pearson's product-moment correlation
data: mydf$X and mydf$Y
t = -5.0867, df = 10, p-value = 0.0004731
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9568189 -0.5371871
sample estimates:
cor
-0.8492663
If you wanted to, you could also create a function like the following to include the standard error of the correlation coefficient.
For convenience, here's the equation again: SE = sqrt((1 - r^2)/(n - 2)), where r is the correlation estimate and n - 2 is the degrees of freedom, both of which are readily available in the output above. Thus, a simple function could be:
cor.test.plus <- function(x) {
  # x is an "htest" object from cor.test(): x$estimate is r, x$parameter is df = n - 2
  list(x,
       Standard.Error = unname(sqrt((1 - x$estimate^2) / x$parameter)))
}
And use it as follows:
cor.test.plus(cor.test(mydf$X, mydf$Y))
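With the example data, this should append Standard.Error of about 0.1669572 to the cor.test() output, which you can verify by hand from SE = sqrt((1 - 0.8492663^2)/10).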
Here, "mydf" is defined as:
mydf <- structure(list(Neighborhood = c("Fair Oaks", "Strandwood", "Walnut Acres",
"Discov. Bay", "Belshaw", "Kennedy", "Cassell", "Miner", "Sedgewick",
"Sakamoto", "Toyon", "Lietz"), X = c(50L, 11L, 2L, 19L, 26L,
73L, 81L, 51L, 11L, 2L, 19L, 25L), Y = c(22.1, 35.9, 57.9, 22.2,
42.4, 5.8, 3.6, 21.4, 55.2, 33.3, 32.4, 38.4)), .Names = c("Neighborhood",
"X", "Y"), class = "data.frame", row.names = c(NA, -12L))
Can't you simply take the test statistic from the return value? Since the test statistic is the estimate divided by its standard error, you can calculate the SE by dividing the estimate by the t-statistic, as shown after the output below.
Using mydf from the answer above:
r <- cor.test(mydf$X, mydf$Y)
tstat <- r$statistic
estimate <- r$estimate
estimate; tstat
cor
-0.8492663
t
-5.086732
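To finish the calculation (a small sketch; the value should match Standard.Error from cor.test.plus() above):
se <- unname(estimate / tstat)  # SE = estimate / t-statistic; unname() drops the "cor"/"t" names
se                              # ~ 0.1669572, same as sqrt((1 - r^2)/(n - 2)) by hand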