Creating correlation matrix p values [duplicate]

Tags:

r

I can get a correlation matrix using the following commands:

> df <- data.frame(x=c(5,6,5,9,4,2,1,3,5,7),y=c(3.1,2.5,3.8,5.4,6.5,2.5,1.5,8.1,7.1,6.1),z=c(5,6,4,9,2,4,1,6,2,4))
> cor(df)
          x         y         z
x 1.0000000 0.2923939 0.6566866
y 0.2923939 1.0000000 0.1167084
z 0.6566866 0.1167084 1.0000000

I can get individual p-values using the command:

> cor.test(df$x, df$y)$p.value
[1] 0.4123234

How can I get a matrix of p-values for all these correlation coefficients? Thanks for your help.

asked Apr 14 '14 07:04

rnso

3 Answers

You can also use the rcorr function from the Hmisc package.

An example of how it works:

library(Hmisc)
mycor <- rcorr(as.matrix(df), type="pearson")

mycor$r holds the correlation matrix and mycor$P (note the capital P) the matrix of corresponding p-values.
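
If adding a package is not an option, the p-values for Pearson correlations can also be computed directly from the correlation matrix with the t statistic that cor.test uses. The following is only a sketch under stated assumptions: complete data (no missing values), Pearson correlation, two-sided tests. The helper name cor_pmat is made up for illustration and is not part of Hmisc or base R.

```r
# Data frame from the question
df <- data.frame(x = c(5, 6, 5, 9, 4, 2, 1, 3, 5, 7),
                 y = c(3.1, 2.5, 3.8, 5.4, 6.5, 2.5, 1.5, 8.1, 7.1, 6.1),
                 z = c(5, 6, 4, 9, 2, 4, 1, 6, 2, 4))

# cor_pmat is a hypothetical helper name, not a function from any package.
# Assumes Pearson correlation and no missing values in `data`.
cor_pmat <- function(data) {
  r <- cor(data)                             # Pearson correlation matrix
  n <- nrow(data)                            # observations per variable
  t_stat <- r * sqrt((n - 2) / (1 - r^2))    # t statistic with n - 2 df
  p <- 2 * pt(-abs(t_stat), df = n - 2)      # two-sided p-values
  diag(p) <- NA                              # a variable is not tested against itself
  p
}

p_mat <- cor_pmat(df)
p_mat
```

Because everything is vectorised over the whole matrix, cor.test is never called; the off-diagonal entries should agree with cor.test(df$x, df$y)$p.value and its counterparts for the other pairs.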

answered Oct 26 '22 00:10

erc

This example calculates the p-value for each combination of columns. It is not an optimal solution (the x-y and y-x p-values are both calculated, for example), but it should provide some inspiration. The main trick is to use expand.grid to generate the combinations of column names, and mapply to call cor.test on each combination:

# All ordered pairs of column names, kept as character strings
col_combinations = expand.grid(names(df), names(df), stringsAsFactors = FALSE)
# Return only the p-value of the correlation test between two named columns
cor_test_wrapper = function(col_name1, col_name2, data_frame) {
    cor.test(data_frame[[col_name1]], data_frame[[col_name2]])$p.value
}
# Call the wrapper once per row of col_combinations
p_vals = mapply(cor_test_wrapper, 
                  col_name1 = col_combinations[[1]], 
                  col_name2 = col_combinations[[2]], 
                  MoreArgs = list(data_frame = df))
matrix(p_vals, ncol(df), ncol(df), dimnames = list(names(df), names(df)))
           x         y          z
x 0.00000000 0.4123234 0.03914453
y 0.41232343 0.0000000 0.74814951
z 0.03914453 0.7481495 0.00000000
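
To avoid testing each pair twice, one option is to loop over the unordered pairs from combn and mirror each result across the diagonal. This is a minimal sketch rather than a tuned implementation; the names vars, pairs and p_mat are chosen here for illustration, and the diagonal is left as NA instead of 0.

```r
# Data frame from the question
df <- data.frame(x = c(5, 6, 5, 9, 4, 2, 1, 3, 5, 7),
                 y = c(3.1, 2.5, 3.8, 5.4, 6.5, 2.5, 1.5, 8.1, 7.1, 6.1),
                 z = c(5, 6, 4, 9, 2, 4, 1, 6, 2, 4))

vars <- names(df)

# Empty result matrix; the diagonal stays NA because no self-test is run
p_mat <- matrix(NA_real_, length(vars), length(vars),
                dimnames = list(vars, vars))

# One column per unordered pair of column names
pairs <- combn(vars, 2)

for (k in seq_len(ncol(pairs))) {
  a <- pairs[1, k]
  b <- pairs[2, k]
  p <- cor.test(df[[a]], df[[b]])$p.value
  p_mat[a, b] <- p   # upper triangle
  p_mat[b, a] <- p   # mirrored into the lower triangle
}

p_mat
```

For k columns this runs k * (k - 1) / 2 tests instead of k^2, which matters more as the number of columns grows.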

answered Oct 25 '22 23:10

Paul Hiemstra

One way is to use corr.test (notice the double r) from the psych package.

Or, if you're a fan of mapply and lapply, you could write your own function to do this. Something like:

rrapply <- function(A, FUN, ...) mapply(function(a, B) lapply(B, 
         function(x) FUN(a, x, ...)), a = A, MoreArgs = list(B = A))
cor.tests <- rrapply(df, cor.test) # a matrix of cor.tests
apply(cor.tests, 1:2, function(x) x[[1]]$p.value) # and it's there

And now you can use the same logic to make a matrix of t-tests or, say, confidence intervals of correlations.
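
As an illustration of that last point, here is a rough sketch that pulls the confidence interval bounds out of cor.test instead of the p-value. It uses nested sapply calls rather than the rrapply helper above, and the names ci_bound, ci_lower and ci_upper are made up for this example. It assumes Pearson correlation with enough observations for cor.test to report conf.int.

```r
# Data frame from the question
df <- data.frame(x = c(5, 6, 5, 9, 4, 2, 1, 3, 5, 7),
                 y = c(3.1, 2.5, 3.8, 5.4, 6.5, 2.5, 1.5, 8.1, 7.1, 6.1),
                 z = c(5, 6, 4, 9, 2, 4, 1, 6, 2, 4))

# ci_bound is a hypothetical helper: i = 1 gives the lower bound of the
# default 95% interval, i = 2 the upper bound
ci_bound <- function(data, i) {
  sapply(data, function(a)
    sapply(data, function(b) cor.test(a, b)$conf.int[i]))
}

ci_lower <- ci_bound(df, 1)   # matrix of lower bounds
ci_upper <- ci_bound(df, 2)   # matrix of upper bounds

ci_lower
ci_upper
```

Each off-diagonal correlation from cor(df) should lie between the matching entries of ci_lower and ci_upper.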

answered Oct 26 '22 00:10

lebatsnok
