
Pairwise correlation from Dunnett's rank test

Tags: r, ggplot2

I would like to show the results of a Dunnett's test in a heatmap, highlighting the correlations between groups.

Output:

                           mean.rank.diff    pval    
EpisodeFourL-EpisodeFiveL      -51.418401 0.33175    
EpisodeOneL-EpisodeFiveL        38.505311 1.00000    
EpisodeSixL-EpisodeFiveL        34.267816 1.00000    
EpisodeThreeL-EpisodeFiveL     -68.548095 0.07237 .  
EpisodeTwoL-EpisodeFiveL       -93.324843 0.00504 ** 
EpisodeOneL-EpisodeFourL        89.923712 0.03094 *  
EpisodeSixL-EpisodeFourL        85.686217 0.12094    
EpisodeThreeL-EpisodeFourL     -17.129694 1.00000    
EpisodeTwoL-EpisodeFourL       -41.906442 0.60473    
EpisodeSixL-EpisodeOneL         -4.237495 1.00000    
EpisodeThreeL-EpisodeOneL     -107.053407 0.00484 ** 
EpisodeTwoL-EpisodeOneL       -131.830154 0.00024 ***
EpisodeThreeL-EpisodeSixL     -102.815911 0.03506 *  
EpisodeTwoL-EpisodeSixL       -127.592659 0.00484 ** 
EpisodeTwoL-EpisodeThreeL      -24.776748 1.00000 

How can I make a "correlation matrix of p-values" that looks like the following, with each cell showing the mean rank difference and coloured by its p-value?

Thanks for your time

enter image description here

I am struggling with the following steps:

  1. pairwise comparison: how to arrange my data so that the episode names are on both axes;
  2. how to split the episodes into the two groups, M and L;
  3. how to create a correlation heatmap with the mean rank difference in each cell and the p-value used to colour the cell.

Sample data:

df<-structure(list(mean.rank.diff = c(31.793661, 50.78439, -93.432344, 
-61.09784, -30.52092, -43.07989, 26.230952, 65.94858, 11.569245, 
20.41009, -125.226005, -111.88223, -62.31458, -93.86428, -5.562709, 
15.16419, -20.224416, -30.3743, 62.911425, 18.01795, 119.663297, 
127.04642, 105.00159, 81.50793, 56.751872, 109.02847, 42.090165, 
63.48998, -14.661707, -45.53849), pval = c(1, 0.43984, 0.03031, 
0.37802, 1, 1, 1, 0.1446, 1, 1, 0.00049, 0.00207, 0.85499, 0.10108, 
1, 1, 1, 1, 1, 1, 0.00098, 0.00033, 0.00782, 0.09761, 1, 0.03568, 
1, 0.60994, 1, 0.60994)), class = "data.frame", row.names = c("EpisodeFourL-EpisodeFiveL", 
"EpisodeFourM-EpisodeFiveM", "EpisodeOneL-EpisodeFiveL", "EpisodeOneM-EpisodeFiveM", 
"EpisodeSixL-EpisodeFiveL", "EpisodeSixM-EpisodeFiveM", "EpisodeThreeL-EpisodeFiveL", 
"EpisodeThreeM-EpisodeFiveM", "EpisodeTwoL-EpisodeFiveL", "EpisodeTwoM-EpisodeFiveM", 
"EpisodeOneL-EpisodeFourL", "EpisodeOneM-EpisodeFourM", "EpisodeSixL-EpisodeFourL", 
"EpisodeSixM-EpisodeFourM", "EpisodeThreeL-EpisodeFourL", "EpisodeThreeM-EpisodeFourM", 
"EpisodeTwoL-EpisodeFourL", "EpisodeTwoM-EpisodeFourM", "EpisodeSixL-EpisodeOneL", 
"EpisodeSixM-EpisodeOneM", "EpisodeThreeL-EpisodeOneL", "EpisodeThreeM-EpisodeOneM", 
"EpisodeTwoL-EpisodeOneL", "EpisodeTwoM-EpisodeOneM", "EpisodeThreeL-EpisodeSixL", 
"EpisodeThreeM-EpisodeSixM", "EpisodeTwoL-EpisodeSixL", "EpisodeTwoM-EpisodeSixM", 
"EpisodeTwoL-EpisodeThreeL", "EpisodeTwoM-EpisodeThreeM"))
user11418708 asked Oct 28 '20 15:10



1 Answer

Maybe this is what you are looking for:

  1. Using dplyr, tidyr and stringr, you can split the row names into episodes and groups.
  2. After the data wrangling, you can build the heatmap with geom_tile, geom_text and facet_grid.
  3. Finally, I made some adjustments to put the facet labels outside the panels and the x-axis on top.

library(ggplot2)
library(tidyr)
library(dplyr)

# Episode names as they appear in the row names, and the labels to display
levels <- paste0("Episode", c("One", "Two", "Three", "Four", "Five", "Six"))
labels <- paste("Episode", c("One", "Two", "Three", "Four", "Five", "Six"))

df1 <- df %>% 
  # Split row names such as "EpisodeFourL-EpisodeFiveL" at the hyphen
  mutate(episodes = row.names(.)) %>% 
  separate(episodes, into = c("episode1", "episode2"), sep = "-") %>% 
  # The last character is the group (M or L); the rest is the episode name
  mutate(type1 = stringr::str_extract(episode1, ".$"), 
         type2 = stringr::str_extract(episode2, ".$"),
         across(c(episode1, episode2), ~ stringr::str_remove(., ".$")),
         across(c(episode1, episode2), ~ factor(., levels = levels, labels = labels)),
         across(c(type1, type2), ~ factor(., levels = c("M", "L"))))

# One facet per episode pair; tiles are filled by p-value and labelled
# with the mean rank difference
ggplot(df1, aes(type1, forcats::fct_rev(type2), fill = pval)) +
  geom_tile() +
  geom_text(aes(label = scales::number(mean.rank.diff, accuracy = .1))) +
  facet_grid(episode1 ~ episode2, switch = "y") +
  scale_x_discrete(position = "top") +
  theme(strip.placement = "outside") +
  labs(x = NULL, y = NULL)
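
If a continuous p-value fill is hard to read, one option is to bin the p-values first, so the colours line up with the significance stars in the printed output. The following is only a sketch under a few assumptions: `bin_pval()` is a hypothetical helper (it is not part of any package), the cut-offs 0.001, 0.01, 0.05 and 0.1 are taken from the usual star legend, and `res` is a three-row illustration copied from the output shown in the question rather than the full data.

```r
library(ggplot2)

# Hypothetical helper: bin p-values with the usual significance-star cut-offs.
# Intervals are closed on the right, e.g. (0.001, 0.01].
bin_pval <- function(p) {
  cut(p,
      breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
      labels = c("<= 0.001", "<= 0.01", "<= 0.05", "<= 0.1", "n.s."),
      include.lowest = TRUE)
}

# Illustrative subset: three group-L comparisons from the printed output
res <- data.frame(
  episode1 = c("Episode Two", "Episode Three", "Episode Two"),
  episode2 = c("Episode One", "Episode One", "Episode Three"),
  mean.rank.diff = c(-131.830154, -107.053407, -24.776748),
  pval = c(0.00024, 0.00484, 1)
)
res$signif <- bin_pval(res$pval)

# Discrete fill: darker tiles for smaller p-values; drop = FALSE keeps
# unused bins in the legend
p <- ggplot(res, aes(episode2, episode1, fill = signif)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = round(mean.rank.diff, 1))) +
  scale_fill_brewer(palette = "Reds", direction = -1, drop = FALSE) +
  labs(x = NULL, y = NULL, fill = "p-value")
p
```

The same `bin_pval()` call could be applied to `df1$pval` and mapped to `fill` in the faceted plot above; whether binned or continuous colours work better depends on how many comparisons fall close to a threshold.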

stefan answered Oct 04 '22 03:10
