Adding group-specific text/data to faceted plot in R/ggplot2

Tags: r, ggplot2

I am comparing the intra-group correlation between duplicate samples within a large gene expression experiment, where I have multiple separate biological groups - the idea being to see if any of the groups is much less well-correlated than the others, indicating a potential sample mixup or other error.

I am using ggplot to plot the expression values of each duplicate pair against each other. I would like to also be able to add the correlation coefficient and p-value to each panel of the plot, which I obtain through summarize and cor.test. You can use this code to get the general idea: in exp1, the duplicates are correlated, but not in exp2.

library(tidyverse)

df <- data.frame(exp=c(rep('exp1', 100), rep('exp2', 100)), a=rnorm(200, 1000, 200))
df <- mutate(df, b=ifelse(exp=='exp1', a*rnorm(100,1,0.05), rnorm(100, 1000, 200)))
head(df)
tail(df)

df %>% ggplot(aes(x=a, y=b))+
  geom_point() +
  facet_wrap(~exp)

group_by(df, exp) %>% 
  summarize(corr=cor.test(a,b)$estimate, pval=cor.test(a,b)$p.value)

This is the plot I generated via ggplot, and I've manually added the R and p-values that I got at the end. But of course, if I have a lot of sample pairs to analyze, it would be nice to be able to add these automatically from within the ggplot call. I'm just not sure how to do it.

[Image: the faceted scatter plot, with R and p-values added manually to each panel]

asked Jun 18 '26 17:06 by C. Murtaugh


1 Answer

If, for whatever reason, you want to build this yourself instead of using the ggpubr functions, you can create your summary data, format labels, and place the labels with geom_text.

I'm formatting the stats so that R is shown with 3 significant digits in fixed notation, and p with 3 significant digits, switching to scientific notation when the value is very small. I renamed those columns to R and p in summarise so that the column names can be used directly in the labels below. Reshaping to long data and creating a new column with unite gets this:

library(tidyverse)
...

group_by(df, exp) %>% 
  summarize(R = cor.test(a, b)$estimate, p = cor.test(a, b)$p.value) %>%
  mutate(R = formatC(R, format = "fg", digits = 3),
         p = formatC(p, format = "g", digits = 3)) %>%
  gather(key = measure, value = value, -exp) %>%
  unite("stat", measure, value, sep = " = ")
#> # A tibble: 4 x 2
#>   exp   stat        
#>   <chr> <chr>       
#> 1 exp1  R = 0.965   
#> 2 exp2  R = 0.0438  
#> 3 exp1  p = 1.14e-58
#> 4 exp2  p = 0.665
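
To see what the two formatC() formats do on their own, here is a small base R sketch. The input numbers are made up for illustration and are only chosen to resemble the values in the output above.

```r
# format = "fg": digits counts significant digits, output stays in fixed notation
r_strong <- formatC(0.96512,  format = "fg", digits = 3)
r_weak   <- formatC(0.043812, format = "fg", digits = 3)

# format = "g": digits also counts significant digits, but very small
# (or very large) values switch to scientific notation
p_tiny  <- formatC(1.1432e-58, format = "g", digits = 3)
p_large <- formatC(0.66512,    format = "g", digits = 3)

print(c(r_strong, r_weak, p_tiny, p_large))
```

Both calls return character strings, which is why the later reshaping step can stack R and p in a single column.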

Then for each of the groups, I want to collapse both labels into one string, separated by a newline \n. This step scales well: if you have more summary stats to display, the same code still works.

summ <- group_by(df, exp) %>% 
  summarize(R = cor.test(a, b)$estimate, p = cor.test(a, b)$p.value) %>%
  mutate(R = formatC(R, format = "fg", digits = 3),
         p = formatC(p, format = "g", digits = 3)) %>%
  gather(key = measure, value = value, -exp) %>%
  unite("stat", measure, value, sep = " = ") %>%
  group_by(exp) %>%
  summarise(both_stats = paste(stat, collapse = "\n"))

summ
#> # A tibble: 2 x 2
#>   exp   both_stats               
#>   <chr> <chr>                    
#> 1 exp1  "R = 0.965\np = 1.14e-58"
#> 2 exp2  "R = 0.0438\np = 0.665"
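
As a variation, the same kind of label can be built without reshaping, and with only one cor.test() call per group. This is a base R sketch, not the approach used above: make_label and summ2 are hypothetical names, the seed is arbitrary, and the data are simulated in the same shape as the question's df.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Simulated data in the same shape as the question's df
df <- data.frame(exp = rep(c("exp1", "exp2"), each = 100),
                 a   = rnorm(200, 1000, 200),
                 stringsAsFactors = FALSE)
df$b <- ifelse(df$exp == "exp1",
               df$a * rnorm(200, 1, 0.05),
               rnorm(200, 1000, 200))

# Hypothetical helper: one cor.test() call per group, one label per group
make_label <- function(d) {
  ct <- cor.test(d$a, d$b)
  stats <- c(
    paste("R =", formatC(unname(ct$estimate), format = "fg", digits = 3)),
    paste("p =", formatC(ct$p.value, format = "g", digits = 3))
    # further "name = value" strings could be appended here
  )
  data.frame(exp = d$exp[1],
             both_stats = paste(stats, collapse = "\n"),
             stringsAsFactors = FALSE)
}

# Apply the helper to each group and stack the one-row results
summ2 <- do.call(rbind, lapply(split(df, df$exp), make_label))
rownames(summ2) <- NULL
summ2
```

The result has the same two columns as summ, so it could be passed to geom_text in the same way.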

In geom_text, I'm setting the x coordinate to -Inf and the y coordinate to Inf. ggplot2 draws infinite positions at the edge of the panel, so -Inf lands on the left edge and Inf on the top edge. That puts the label in the top-left corner, regardless of the values in the data.

The one thing I don't like here is having to push hjust and vjust outside their intended range of 0 to 1. But nudge_x/nudge_y won't do anything here, because nudging an infinite coordinate still leaves it infinite.

df %>% 
  ggplot(aes(x = a, y = b)) +
  geom_point() +
  geom_text(aes(x = -Inf, y = Inf, label = both_stats), data = summ, 
            hjust = -0.1, vjust = 1.1, lineheight = 1) +
  facet_wrap(~ exp)

Created on 2018-11-14 by the reprex package (v0.2.1)
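
If keeping hjust and vjust inside 0 to 1 matters more than pinning the label to the panel corner, one possible alternative is to anchor the label at finite coordinates taken from the data range. The sketch below assumes simulated data with an arbitrary seed, and the label strings are copied from the summ output above purely as placeholders, so they do not describe this simulated data. The plotting part is guarded so the sketch still runs where ggplot2 is not installed.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Simulated data in the same shape as the question's df
df <- data.frame(exp = rep(c("exp1", "exp2"), each = 100),
                 a   = rnorm(200, 1000, 200),
                 stringsAsFactors = FALSE)
df$b <- ifelse(df$exp == "exp1",
               df$a * rnorm(200, 1, 0.05),
               rnorm(200, 1000, 200))

# Placeholder labels, copied from the printed summ output
summ <- data.frame(
  exp = c("exp1", "exp2"),
  both_stats = c("R = 0.965\np = 1.14e-58", "R = 0.0438\np = 0.665"),
  stringsAsFactors = FALSE
)

# Finite anchor: top-left corner of the data range. The same position works
# in every panel because facet_wrap() uses fixed scales by default.
summ$x <- min(df$a)
summ$y <- max(df$b)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = a, y = b)) +
    geom_point() +
    geom_text(aes(x = x, y = y, label = both_stats), data = summ,
              hjust = 0, vjust = 1, lineheight = 1) +
    facet_wrap(~ exp)
  print(p)
}
```

With finite coordinates, nudge_x and nudge_y also take effect. The trade-off is that the label sits at the corner of the data range rather than the panel, so it can overlap points near that corner.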

answered Jun 21 '26 07:06 by camille


