
put correlation coefficient on ggplot scatter plot after faceting

Tags:

r

ggplot2

I'm having trouble putting a correlation coefficient on my scatter plot after faceting by another variable with facet_wrap. Below is an example using the mtcars dataset for illustration. When I plot it, both panels show the same correlation value. It seems the correlation coefficient is not calculated separately for each facet, and I could not figure out how to achieve that. I would really appreciate any help.

library(ggplot2)
library(dplyr)
corr_eqn <- function(x,y, method='pearson', digits = 2) {
    corr_coef <- round(cor.test(x, y, method=method)$estimate, digits = digits)
    corr_pval <- tryCatch(format(cor.test(x,y, method=method)$p.value, 
                                 scientific=TRUE),
                          error=function(e) NA)
    paste(method, 'r = ', corr_coef, ',', 'pval =', corr_pval)
}

sca.plot <- function (cor.coef=TRUE) {
    df<- mtcars %>% filter(vs==1)
    p<- df %>% 
        ggplot(aes(x=hp, y=mpg))+
        geom_point()+
        geom_smooth()+
        facet_wrap(~cyl, ncol=3)

    if (cor.coef) {
        p<- p+geom_text(x=0.9*max(df$hp, na.rm=TRUE),
                        y=0.9*max(df$mpg, na.rm=TRUE),
                        label = corr_eqn(df[['hp']],df[['mpg']],
                                         method='pearson'))
    }
    return (p)    
}

sca.plot(cor.coef=TRUE)
zesla asked Sep 15 '17 15:09




1 Answer

Pass the faceting variable through inputFacet, loop over its values to calculate corr_eqn for each one, and facet the plot by variable name using get.

In shiny you'll probably have the user input as input$facet; here it's called inputFacet. We build the main plot by getting this variable in facet_wrap(~ get(inputFacet), ncol = 3). Next we loop over all facet values with for(i in seq_along(resCor$facets)) and store the results in resCor.

This should solve the "correlation coef is not calculated for each facet" problem.

library(dplyr)
library(ggplot2)

inputFacet <- "cyl"
cor.coef = TRUE
df <- mtcars

p <- df %>% 
    ggplot(aes(hp, mpg))+
    geom_point()+
    geom_smooth()+
    facet_wrap(~ get(inputFacet), ncol = 3)

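if (cor.coef) {

    # One row per facet level; corr_eqn() is the helper defined in the question
    resCor <- data.frame(facets = unique(df[, inputFacet]))
    for(i in seq_along(resCor$facets)) {
        # Subset to the current facet level and build its label
        foo <- df[df[, inputFacet] == resCor$facets[i], ]
        resCor$text[i] <- corr_eqn(foo$hp, foo$mpg)
    }
    # Rename the first column so it matches the faceting variable
    colnames(resCor)[1] <- inputFacet

    # Passing resCor as layer data gives each panel its own label
    p <- p + geom_text(data = resCor, 
                       aes(0.9 * max(df$hp, na.rm = TRUE),
                           0.9 * max(df$mpg, na.rm = TRUE),
                           label = text))

}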

p
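
For comparison, the same per-facet labels can be built without an explicit loop. The following is only a sketch under a few assumptions: the faceting variable is fixed to cyl rather than chosen through inputFacet, the helper cor_label is a hypothetical name that does not appear in the question, dplyr 1.0 or later is installed (for the .groups argument), and the label is pinned to the top-right corner of each panel with x = Inf, y = Inf instead of 0.9 times the maximum. The linear smoother is also a choice made here to keep the example quiet, not something the question asked for.

```r
library(dplyr)
library(ggplot2)

df <- mtcars

# Hypothetical helper: runs cor.test() once and formats both values it returns
cor_label <- function(x, y, method = "pearson", digits = 2) {
    ct <- cor.test(x, y, method = method)
    paste0(method, " r = ", round(unname(ct$estimate), digits),
           ", pval = ", format(ct$p.value, scientific = TRUE, digits = digits))
}

# One row per facet level, holding that panel's label text
labels <- df %>%
    group_by(cyl) %>%
    summarise(text = cor_label(hp, mpg), .groups = "drop")

p <- ggplot(df, aes(hp, mpg)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x) +  # assumption: linear fit
    facet_wrap(~ cyl, ncol = 3) +
    # labels carries a cyl column, so each row is drawn only in its own panel
    geom_text(data = labels,
              aes(x = Inf, y = Inf, label = text),
              hjust = 1.05, vjust = 1.5,
              inherit.aes = FALSE)

p
```

Because the label data frame has the same cyl column as the plotted data, ggplot2 assigns each label row to the matching facet, which is the same mechanism the loop-based version relies on.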

pogibas answered Sep 21 '22 17:09
