
Comparison between lm() and cor(): why do the results not match?

Tags: r, correlation, lm

I am running an analysis on my dataset. Here is an excerpt of the data frame df:

   DOY     species replicate position      SLA      PAR      temp    A1200    susPQ    susNPQ      FvFm      
1:  46 LINGONBERRY         1      LOW 65.75638 19.70906 -2.850833 0.569690 2.147297 0.9321771 0.5263661 562.1016 0.011440    28.02628
2:  46 LINGONBERRY         2      LOW 59.45028 19.78096 -2.850833 0.893840 2.511543 1.0516496 0.5503916 533.6136 0.028930    25.40703
3:  46 LINGONBERRY         3      LOW 59.51058 17.52833 -2.850833 0.731765 2.278927 1.0678274 0.5242824 549.6316 0.020185    46.16188
4:  46        PINE         1      LOW 35.90156 20.85151 -2.850833 1.518910 2.319431 2.2168853 0.4189484 392.6067 0.059280    47.79101
5:  46        PINE         1      TOP 27.29495 90.27197 -2.850833 1.780420 1.739912 1.5691443 0.4037803 418.3636 0.032890    59.03595
6:  46        PINE         2      LOW 34.38626 27.86268 -2.850833 1.959910

I want to check the correlations between my variables, so I use the cor() and rcorr() functions.

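library(corrplot)  # corrplot()
library(Hmisc)     # rcorr()

# Pearson correlations for column 1 (DOY) and columns 5 to 19, plotted as numbers
cor.df <- corrplot(cor(df[, c(1, 5:19)], method = "pearson"), method = "number",
                   type = "upper", tl.col = "black")

# rcorr() needs a matrix; it returns the correlations ($r) and their p-values ($P)
df_matrix <- as.matrix(df[, c(1, 5:19)])
rcor.df <- rcorr(df_matrix, type = "pearson")

# Same plot, with correlations whose p-value exceeds 0.001 marked as non-significant
cor.df_R <- corrplot(rcor.df$r, method = "number", type = "upper", tl.col = "black",
                     p.mat = rcor.df$P, sig.level = 0.001)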

[Image: correlation plot (Pearson r) for my variables, with non-significant correlations crossed out]

As the results were not what I expected, I wanted to double-check that the method is working. To test that, I simply put some pairs of variables into lm(). My assumption was that the R² from the lm() summary should be the same as the values shown in my corrplot. But that is not the case.

1/ What is wrong with my assumption? Why does the R² from lm() not correspond to the r from rcorr() or cor()?
2/ What would be a suitable test to check that the correlations shown in my corrplot are trustworthy? It is always good to run a backup test when results contradict the literature, as is the case here.

Paulina asked Sep 01 '25 20:09



1 Answer

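The R² of a simple linear regression is not the correlation itself but the square of the Pearson correlation between the independent variable and the dependent variable. The values in your corrplot therefore have to be squared before you compare them with the R² reported by summary(lm()).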

x <- 1:5
y <- rnorm(5)

cor(x, y)^2
# 0.001016668
summary(lm(y ~ x))$r.squared
# 0.001016668
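
As a possible backup check for the second question, here is a minimal sketch in base R. The data are simulated and purely hypothetical: the names PAR and SLA only echo columns from the question, and the slope and noise level are assumptions chosen for illustration. The sketch compares r² with R², recovers the sign of r from the regression slope, and compares the p-value from cor.test() with the p-value of the slope in lm().

```r
# Hypothetical data: the names echo the question, the values are simulated.
set.seed(42)
n   <- 30
PAR <- runif(n, 15, 95)
SLA <- 70 - 0.4 * PAR + rnorm(n, sd = 10)  # built to correlate negatively with PAR

r   <- cor(PAR, SLA, method = "pearson")
fit <- lm(SLA ~ PAR)
r2  <- summary(fit)$r.squared

# R^2 from lm() equals r squared, so compare r^2 (not r) with the regression output.
all.equal(r^2, r2)

# R^2 drops the sign of the correlation; the sign of the slope restores it.
r_from_lm <- sign(coef(fit)[["PAR"]]) * sqrt(r2)
all.equal(r, r_from_lm)

# For a single predictor, the test of the correlation and the test of the
# slope are the same t-test, so the p-values agree.
p_cor <- cor.test(PAR, SLA)$p.value
p_lm  <- summary(fit)$coefficients["PAR", "Pr(>|t|)"]
all.equal(p_cor, p_lm)
```

Each all.equal() call should return TRUE. If the same holds for a pair of columns from your own data, the numbers in the corrplot and the output of lm() are consistent with each other, and any disagreement with the literature comes from the data rather than from the method.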
Stéphane Laurent answered Sep 04 '25 12:09
