Following up on this question and for the sake of completeness, I modified the accepted answer and customized the resulting plot, but I am still facing some important problems. To sum up, I am doing boxplots reflecting significance of Kruskal-Wallis and pairwise Wilcoxon test comparisons. I want to replace the p-value numbers with asterisks, and show only the significant comparisons, reducing vertical spacing to the max. Basically I want to do this, but with the added problem of facets, that messes everything up. So far I have worked on a very decent MWE, but it still shows problems... <pre class="prettyprint"><code>library(reshape2) library(ggplot2) library(gridExtra) library(tidyverse) library(data.table) library(ggsignif) library(RColorBrewer) data(iris) iris$treatment <- rep(c("A","B"), length(iris$Species)/2) mydf <- melt(iris, measure.vars=names(iris)[1:4]) mydf$treatment <- as.factor(mydf$treatment) mydf$variable <- factor(mydf$variable, levels=sort(levels(mydf$variable))) mydf$both <- factor(paste(mydf$treatment, mydf$variable), levels=(unique(paste(mydf$treatment, mydf$variable)))) # Change data to reduce number of statistically significant differences set.seed(2) mydf <- mydf %>% mutate(value=rnorm(nrow(mydf))) ## ##FIRST TEST BOTH #Kruskal-Wallis addkw <- as.data.frame(mydf %>% group_by(Species) %>% summarize(p.value = kruskal.test(value ~ both)$p.value)) #addkw$p.adjust <- p.adjust(addkw$p.value, "BH") a <- combn(levels(mydf$both), 2, simplify = FALSE) #new p.values pv.final <- data.frame() for (gr in unique(mydf$Species)){ for (i in 1:length(a)){ tis <- a[[i]] #variable pair to test as <- subset(mydf, Species==gr & both %in% tis) pv <- wilcox.test(value ~ both, data=as)$p.value ddd <- data.table(as) asm <- as.data.frame(ddd[, list(value=mean(value)), by=list(both=both)]) asm2 <- dcast(asm, .~both, value.var="value")[,-1] pf <- data.frame(group1=paste(tis[1], gr), group2=paste(tis[2], gr), mean.group1=asm2[,1], mean.group2=asm2[,2], FC.1over2=asm2[,1]/asm2[,2], p.value=pv) pv.final <- rbind(pv.final, pf) } } #pv.final$p.adjust <- p.adjust(pv.final$p.value, method="BH") pv.final$map.signif <- ifelse(pv.final$p.value > 0.05, "", ifelse(pv.final$p.value > 0.01,"*", "**")) cols <- colorRampPalette(brewer.pal(length(unique(mydf$Species)), "Set1")) myPal <- cols(length(unique(mydf$Species))) #Function to get a list of plots to use as "facets" with grid.arrange plot.list=function(mydf, pv.final, addkw, a, myPal){ mylist <- list() i <- 0 for (sp in unique(mydf$Species)){ i <- i+1 mydf0 <- subset(mydf, Species==sp) addkw0 <- subset(addkw, Species==sp) pv.final0 <- pv.final[grep(sp, pv.final$group1), ] num.signif <- sum(pv.final0$p.value <= 0.05) P <- ggplot(mydf0,aes(x=both, y=value)) + geom_boxplot(aes(fill=Species)) + stat_summary(fun.y=mean, geom="point", shape=5, size=4) + facet_grid(~Species, scales="free", space="free_x") + scale_fill_manual(values=myPal[i]) + #WHY IS COLOR IGNORED? geom_text(data=addkw0, hjust=0, size=4.5, aes(x=0, y=round(max(mydf0$value, na.rm=TRUE)+0.5), label=paste0("KW p=",p.value))) + geom_signif(test="wilcox.test", comparisons = a[which(pv.final0$p.value<=0.05)],#I can use "a"here map_signif_level = F, vjust=0, textsize=4, size=0.5, step_increase = 0.05) if (i==1){ P <- P + theme(legend.position="none", axis.text.x=element_text(size=20, angle=90, hjust=1), axis.text.y=element_text(size=20), axis.title=element_blank(), strip.text.x=element_text(size=20,face="bold"), strip.text.y=element_text(size=20,face="bold")) } else{ P <- P + theme(legend.position="none", axis.text.x=element_text(size=20, angle=90, hjust=1), axis.text.y=element_blank(), axis.ticks.y=element_blank(), axis.title=element_blank(), strip.text.x=element_text(size=20,face="bold"), strip.text.y=element_text(size=20,face="bold")) } #WHY USING THE CODE BELOW TO CHANGE NUMBERS TO ASTERISKS I GET ERRORS? #P2 <- ggplot_build(P) #P2$data[[3]]$annotation <- rep(subset(pv.final0, p.value<=0.05)$map.signif, each=3) #P <- plot(ggplot_gtable(P2)) mylist[[sp]] <- list(num.signif, P) } return(mylist) } p.list <- plot.list(mydf, pv.final, addkw, a, myPal) y.rng <- range(mydf$value) # Get the highest number of significant p-values across all three "facets" height.factor <- 0.3 max.signif <- max(sapply(p.list, function(x) x[[1]])) # Lay out the three plots as facets (one for each Species), but adjust so that y-range is same for each facet. Top of y-range is adjusted using max_signif. png(filename="test.png", height=800, width=1200) grid.arrange(grobs=lapply(p.list, function(x) x[[2]] + scale_y_continuous(limits=c(y.rng[1], y.rng[2] + height.factor*max.signif))), ncol=length(unique(mydf$Species)), top="Random title", left="Value") #HOW TO CHANGE THE SIZE OF THE TITLE AND THE Y AXIS TEXT? #HOW TO ADD A COMMON LEGEND? dev.off() </code></pre> It produces the following plot: <img src="https://i.stack.imgur.com/0Nuyn.png" alt="test"> As you can see there are some problems, most obviously: 1- Coloring does not work for some reason 2- I do not seem to be able to change the annotation with the asterisks I want something more like this (mockup): <img src="https://i.stack.imgur.com/WgaHB.png" alt="test1"> So we need to: 1- Make coloring work 2- Show asterisks instead of numbers ...and for the win: 3- Make a common legend 4- Place Kruskal-Wallis line on top 5- Change the size (and alignment) of the title and y axis text IMPORTANT NOTES I would appreciate my code is left as intact as possible even if it isn't the prettiest, cause I still have to make use of intermediate objects like "CNb" or "pv.final". The solution should be easily transferable to other cases; please consider testing "variable" alone, instead of "both"... In this case we have 6 "facets" (vertically and horizontally) and everything gets even more screwed up... I made this other MWE: <pre class="prettyprint"><code>##NOW TEST MEASURE, TO GET VERTICAL AND HORIZONTAL FACETS addkw <- as.data.frame(mydf %>% group_by(treatment, Species) %>% summarize(p.value = kruskal.test(value ~ variable)$p.value)) #addkw$p.adjust <- p.adjust(addkw$p.value, "BH") a <- combn(levels(mydf$variable), 2, simplify = FALSE) #new p.values pv.final <- data.frame() for (tr in levels(mydf$treatment)){ for (gr in levels(mydf$Species)){ for (i in 1:length(a)){ tis <- a[[i]] #variable pair to test as <- subset(mydf, treatment==tr & Species==gr & variable %in% tis) pv <- wilcox.test(value ~ variable, data=as)$p.value ddd <- data.table(as) asm <- as.data.frame(ddd[, list(value=mean(value, na.rm=T)), by=list(variable=variable)]) asm2 <- dcast(asm, .~variable, value.var="value")[,-1] pf <- data.frame(group1=paste(tis[1], gr, tr), group2=paste(tis[2], gr, tr), mean.group1=asm2[,1], mean.group2=asm2[,2], FC.1over2=asm2[,1]/asm2[,2], p.value=pv) pv.final <- rbind(pv.final, pf) } } } #pv.final$p.adjust <- p.adjust(pv.final$p.value, method="BH") # set signif level pv.final$map.signif <- ifelse(pv.final$p.value > 0.05, "", ifelse(pv.final$p.value > 0.01,"*", "**")) plot.list2=function(mydf, pv.final, addkw, a, myPal){ mylist <- list() i <- 0 for (sp in unique(mydf$Species)){ for (tr in unique(mydf$treatment)){ i <- i+1 mydf0 <- subset(mydf, Species==sp & treatment==tr) addkw0 <- subset(addkw, Species==sp & treatment==tr) pv.final0 <- pv.final[grep(paste(sp,tr), pv.final$group1), ] num.signif <- sum(pv.final0$p.value <= 0.05) P <- ggplot(mydf0,aes(x=variable, y=value)) + geom_boxplot(aes(fill=Species)) + stat_summary(fun.y=mean, geom="point", shape=5, size=4) + facet_grid(treatment~Species, scales="free", space="free_x") + scale_fill_manual(values=myPal[i]) + #WHY IS COLOR IGNORED? geom_text(data=addkw0, hjust=0, size=4.5, aes(x=0, y=round(max(mydf0$value, na.rm=TRUE)+0.5), label=paste0("KW p=",p.value))) + geom_signif(test="wilcox.test", comparisons = a[which(pv.final0$p.value<=0.05)],#I can use "a"here map_signif_level = F, vjust=0, textsize=4, size=0.5, step_increase = 0.05) if (i==1){ P <- P + theme(legend.position="none", axis.text.x=element_blank(), axis.text.y=element_text(size=20), axis.title=element_blank(), axis.ticks.x=element_blank(), strip.text.x=element_text(size=20,face="bold"), strip.text.y=element_text(size=20,face="bold")) } if (i==4){ P <- P + theme(legend.position="none", axis.text.x=element_text(size=20, angle=90, hjust=1), axis.text.y=element_text(size=20), axis.title=element_blank(), strip.text.x=element_text(size=20,face="bold"), strip.text.y=element_text(size=20,face="bold")) } if ((i==2)|(i==3)){ P <- P + theme(legend.position="none", axis.text.x=element_blank(), axis.text.y=element_blank(), axis.title=element_blank(), axis.ticks.x=element_blank(), axis.ticks.y=element_blank(), strip.text.x=element_text(size=20,face="bold"), strip.text.y=element_text(size=20,face="bold")) } if ((i==5)|(i==6)){ P <- P + theme(legend.position="none", axis.text.x=element_text(size=20, angle=90, hjust=1), axis.text.y=element_blank(), #axis.ticks.y=element_blank(), #WHY SPECIFYING THIS GIVES ERROR? axis.title=element_blank(), axis.ticks.y=element_blank(), strip.text.x=element_text(size=20,face="bold"), strip.text.y=element_text(size=20,face="bold")) } #WHY USING THE CODE BELOW TO CHANGE NUMBERS TO ASTERISKS I GET ERRORS? #P2 <- ggplot_build(P) #P2$data[[3]]$annotation <- rep(subset(pv.final0, p.value<=0.05)$map.signif, each=3) #P <- plot(ggplot_gtable(P2)) sptr <- paste(sp,tr) mylist[[sptr]] <- list(num.signif, P) } } return(mylist) } p.list2 <- plot.list2(mydf, pv.final, addkw, a, myPal) y.rng <- range(mydf$value) # Get the highest number of significant p-values across all three "facets" height.factor <- 0.5 max.signif <- max(sapply(p.list2, function(x) x[[1]])) # Lay out the three plots as facets (one for each Species), but adjust so that y-range is same for each facet. Top of y-range is adjusted using max_signif. png(filename="test2.png", height=800, width=1200) grid.arrange(grobs=lapply(p.list2, function(x) x[[2]] + scale_y_continuous(limits=c(y.rng[1], y.rng[2] + height.factor*max.signif))), ncol=length(unique(mydf$Species)), top="Random title", left="Value") #HOW TO CHANGE THE SIZE OF THE TITLE AND THE Y AXIS TEXT? #HOW TO ADD A COMMON LEGEND? dev.off() </code></pre> That produces the following plot: <img src="https://i.stack.imgur.com/4u1C9.png" alt="test2"> Now the color problem becomes more striking, the facet heights are uneven, and something should be done with the redundant facet strip texts too. I am stuck at this point, so would appreciate any help. Sorry for the long question, but I think it is almost there! Thanks!!

You can try following. As your code is really busy and for me too complicated to understand, I suggest a different approach. I tried to avoid loops and to use the <code>tidyverse</code> as much as possible. Thus, first I created your data. Then calculated kruskal wallis tests as this was not possible within <code>ggsignif</code>. Afterwards I will plot all p.values using <code>geom_signif</code>. Finally, insignificant ones will be removed and a step increase is added. 1- Make coloring work done 2- Show asterisks instead of numbers done ...and for the win: 3- Make a common legend done 4- Place Kruskal-Wallis line on top done, I placed the values at the bottom 5- Change the size (and alignment) of the title and y axis text done <pre class="prettyprint"><code>library(tidyverse) library(ggsignif) # 1. your data set.seed(2) df <- as.tbl(iris) %>% mutate(treatment=rep(c("A","B"), length(iris$Species)/2)) %>% gather(key, value, -Species, -treatment) %>% mutate(value=rnorm(n())) %>% mutate(key=factor(key, levels=unique(key))) %>% mutate(both=interaction(treatment, key, sep = " ")) # 2. Kruskal test KW <- df %>% group_by(Species) %>% summarise(p=round(kruskal.test(value ~ both)$p.value,2), y=min(value), x=1) %>% mutate(y=min(y)) # 3. Plot P <- df %>% ggplot(aes(x=both, y=value)) + geom_boxplot(aes(fill=Species)) + facet_grid(~Species) + ylim(-3,7)+ theme(axis.text.x = element_text(angle=45, hjust=1)) + geom_signif(comparisons = combn(levels(df$both),2,simplify = F), map_signif_level = T) + stat_summary(fun.y=mean, geom="point", shape=5, size=4) + xlab("") + geom_text(data=KW,aes(x, y=y, label=paste0("KW p=",p)),hjust=0) + ggtitle("Plot") + ylab("This is my own y-lab") # 4. remove not significant values and add step increase P_new <- ggplot_build(P) P_new$data[[2]] <- P_new$data[[2]] %>% filter(annotation != "NS.") %>% group_by(PANEL) %>% mutate(index=(as.numeric(group[drop=T])-1)*0.5) %>% mutate(y=y+index, yend=yend+index) %>% select(-index) %>% as.data.frame() # the final plot plot(ggplot_gtable(P_new)) </code></pre> <img src="https://i.stack.imgur.com/m4qwd.png" alt="enter image description here"> and similar approach using two facets <pre class="prettyprint"><code># -------------------- # 5. Kruskal KW <- df %>% group_by(Species, treatment) %>% summarise(p=round(kruskal.test(value ~ both)$p.value,2), y=min(value), x=1) %>% ungroup() %>% mutate(y=min(y)) # 6. Plot with two facets P <- df %>% ggplot(aes(x=key, y=value)) + geom_boxplot(aes(fill=Species)) + facet_grid(treatment~Species) + ylim(-5,7)+ theme(axis.text.x = element_text(angle=45, hjust=1)) + geom_signif(comparisons = combn(levels(df$key),2,simplify = F), map_signif_level = T) + stat_summary(fun.y=mean, geom="point", shape=5, size=4) + xlab("") + geom_text(data=KW,aes(x, y=y, label=paste0("KW p=",p)),hjust=0) + ggtitle("Plot") + ylab("This is my own y-lab") # 7. remove not significant values and add step increase P_new <- ggplot_build(P) P_new$data[[2]] <- P_new$data[[2]] %>% filter(annotation != "NS.") %>% group_by(PANEL) %>% mutate(index=(as.numeric(group[drop=T])-1)*0.5) %>% mutate(y=y+index, yend=yend+index) %>% select(-index) %>% as.data.frame() # the final plot plot(ggplot_gtable(P_new)) </code></pre> <img src="https://i.stack.imgur.com/PY8WH.png" alt="enter image description here"> Edit. Regarding to your <code>p.adjust</code> needs, you can set up a function on your own and calling it directly within <code>geom_signif()</code>. <pre class="prettyprint"><code>wilcox.test.BH.adjusted <- function(x,y,n){ tmp <- wilcox.test(x,y) tmp$p.value <- p.adjust(tmp$p.value, n = n,method = "BH") tmp } geom_signif(comparisons = combn(levels(df$both),2,simplify = F), map_signif_level = T, test = "wilcox.test.BH.adjusted", test.args = list(n=8)) </code></pre> The challenge is to know how many independet tests you will have in the end. Then you can set the <code>n</code> by your own. Here I used <code>8</code>. But this is maybe wrong.

Boxplots with Wilcoxon significance levels, and facets, show only significant comparisons with asterisks

Tags:

r

testing

ggplot2

facet

boxplot

Following up on this question and for the sake of completeness, I modified the accepted answer and customized the resulting plot, but I am still facing some important problems.

To sum up, I am doing boxplots reflecting significance of Kruskal-Wallis and pairwise Wilcoxon test comparisons.

I want to replace the p-value numbers with asterisks, and show only the significant comparisons, reducing vertical spacing to the max.

Basically I want to do this, but with the added problem of facets, that messes everything up.

So far I have worked on a very decent MWE, but it still shows problems...

library(reshape2)
library(ggplot2)
library(gridExtra)
library(tidyverse)
library(data.table)
library(ggsignif)
library(RColorBrewer)

data(iris)
iris$treatment <- rep(c("A","B"), length(iris$Species)/2)
mydf <- melt(iris, measure.vars=names(iris)[1:4])
mydf$treatment <- as.factor(mydf$treatment)
mydf$variable <- factor(mydf$variable, levels=sort(levels(mydf$variable)))
mydf$both <- factor(paste(mydf$treatment, mydf$variable), levels=(unique(paste(mydf$treatment, mydf$variable))))

# Change data to reduce number of statistically significant differences
set.seed(2)
mydf <- mydf %>% mutate(value=rnorm(nrow(mydf)))
##

##FIRST TEST BOTH

#Kruskal-Wallis
addkw <- as.data.frame(mydf %>% group_by(Species) %>%
                       summarize(p.value = kruskal.test(value ~ both)$p.value))
#addkw$p.adjust <- p.adjust(addkw$p.value, "BH")
a <- combn(levels(mydf$both), 2, simplify = FALSE)
#new p.values
pv.final <- data.frame()
for (gr in unique(mydf$Species)){
    for (i in 1:length(a)){
        tis <- a[[i]] #variable pair to test
        as <- subset(mydf, Species==gr & both %in% tis)
        pv <- wilcox.test(value ~ both, data=as)$p.value
        ddd <- data.table(as)
        asm <- as.data.frame(ddd[, list(value=mean(value)), by=list(both=both)])
        asm2 <- dcast(asm, .~both, value.var="value")[,-1]
        pf <- data.frame(group1=paste(tis[1], gr), group2=paste(tis[2], gr), mean.group1=asm2[,1], mean.group2=asm2[,2], FC.1over2=asm2[,1]/asm2[,2], p.value=pv)
        pv.final <- rbind(pv.final, pf)
    }
}
#pv.final$p.adjust <- p.adjust(pv.final$p.value, method="BH")
pv.final$map.signif <- ifelse(pv.final$p.value > 0.05, "", ifelse(pv.final$p.value > 0.01,"*", "**"))

cols <- colorRampPalette(brewer.pal(length(unique(mydf$Species)), "Set1"))
myPal <- cols(length(unique(mydf$Species)))

#Function to get a list of plots to use as "facets" with grid.arrange
plot.list=function(mydf, pv.final, addkw, a, myPal){
    mylist <- list()
    i <- 0
    for (sp in unique(mydf$Species)){
        i <- i+1
        mydf0 <- subset(mydf, Species==sp)
        addkw0 <- subset(addkw, Species==sp)
        pv.final0 <- pv.final[grep(sp, pv.final$group1), ]
        num.signif <- sum(pv.final0$p.value <= 0.05)
        P <- ggplot(mydf0,aes(x=both, y=value)) +
            geom_boxplot(aes(fill=Species)) +
            stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
            facet_grid(~Species, scales="free", space="free_x") +
            scale_fill_manual(values=myPal[i]) + #WHY IS COLOR IGNORED?
            geom_text(data=addkw0, hjust=0, size=4.5, aes(x=0, y=round(max(mydf0$value, na.rm=TRUE)+0.5), label=paste0("KW p=",p.value))) +
            geom_signif(test="wilcox.test", comparisons = a[which(pv.final0$p.value<=0.05)],#I can use "a"here
              map_signif_level = F,            
              vjust=0,
              textsize=4,
              size=0.5,
              step_increase = 0.05)
        if (i==1){
            P <- P + theme(legend.position="none",
                  axis.text.x=element_text(size=20, angle=90, hjust=1),
                  axis.text.y=element_text(size=20),
                  axis.title=element_blank(),
                  strip.text.x=element_text(size=20,face="bold"),
                  strip.text.y=element_text(size=20,face="bold"))
        } else{
            P <- P + theme(legend.position="none",
                  axis.text.x=element_text(size=20, angle=90, hjust=1),
                  axis.text.y=element_blank(),
                  axis.ticks.y=element_blank(),
                  axis.title=element_blank(),
                  strip.text.x=element_text(size=20,face="bold"),
                  strip.text.y=element_text(size=20,face="bold"))
        }
        #WHY USING THE CODE BELOW TO CHANGE NUMBERS TO ASTERISKS I GET ERRORS?
        #P2 <- ggplot_build(P)
        #P2$data[[3]]$annotation <- rep(subset(pv.final0, p.value<=0.05)$map.signif, each=3)
        #P <- plot(ggplot_gtable(P2))
        mylist[[sp]] <- list(num.signif, P)
    }
    return(mylist)
}
p.list <- plot.list(mydf, pv.final, addkw, a, myPal)
y.rng <- range(mydf$value)
# Get the highest number of significant p-values across all three "facets"
height.factor <- 0.3
max.signif <- max(sapply(p.list, function(x) x[[1]]))
# Lay out the three plots as facets (one for each Species), but adjust so that y-range is same for each facet. Top of y-range is adjusted using max_signif.
png(filename="test.png", height=800, width=1200)
grid.arrange(grobs=lapply(p.list, function(x) x[[2]] +
             scale_y_continuous(limits=c(y.rng[1], y.rng[2] + height.factor*max.signif))), 
             ncol=length(unique(mydf$Species)), top="Random title", left="Value") #HOW TO CHANGE THE SIZE OF THE TITLE AND THE Y AXIS TEXT?
             #HOW TO ADD A COMMON LEGEND?
dev.off()

It produces the following plot:

test

As you can see there are some problems, most obviously:

1- Coloring does not work for some reason

2- I do not seem to be able to change the annotation with the asterisks

I want something more like this (mockup):

test1

So we need to:

1- Make coloring work

2- Show asterisks instead of numbers

...and for the win:

3- Make a common legend

4- Place Kruskal-Wallis line on top

5- Change the size (and alignment) of the title and y axis text

IMPORTANT NOTES

I would appreciate my code is left as intact as possible even if it isn't the prettiest, cause I still have to make use of intermediate objects like "CNb" or "pv.final".

The solution should be easily transferable to other cases; please consider testing "variable" alone, instead of "both"... In this case we have 6 "facets" (vertically and horizontally) and everything gets even more screwed up...

I made this other MWE:

##NOW TEST MEASURE, TO GET VERTICAL AND HORIZONTAL FACETS

addkw <- as.data.frame(mydf %>% group_by(treatment, Species) %>%
                       summarize(p.value = kruskal.test(value ~ variable)$p.value))
#addkw$p.adjust <- p.adjust(addkw$p.value, "BH")
a <- combn(levels(mydf$variable), 2, simplify = FALSE)
#new p.values
pv.final <- data.frame()
for (tr in levels(mydf$treatment)){
    for (gr in levels(mydf$Species)){
        for (i in 1:length(a)){
            tis <- a[[i]] #variable pair to test
            as <- subset(mydf, treatment==tr & Species==gr & variable %in% tis)
            pv <- wilcox.test(value ~ variable, data=as)$p.value
            ddd <- data.table(as)
            asm <- as.data.frame(ddd[, list(value=mean(value, na.rm=T)), by=list(variable=variable)])
            asm2 <- dcast(asm, .~variable, value.var="value")[,-1]
            pf <- data.frame(group1=paste(tis[1], gr, tr), group2=paste(tis[2], gr, tr), mean.group1=asm2[,1], mean.group2=asm2[,2], FC.1over2=asm2[,1]/asm2[,2], p.value=pv)
            pv.final <- rbind(pv.final, pf)
        }
    }
}
#pv.final$p.adjust <- p.adjust(pv.final$p.value, method="BH")
# set signif level
pv.final$map.signif <- ifelse(pv.final$p.value > 0.05, "", ifelse(pv.final$p.value > 0.01,"*", "**"))
plot.list2=function(mydf, pv.final, addkw, a, myPal){
    mylist <- list()
    i <- 0
    for (sp in unique(mydf$Species)){
    for (tr in unique(mydf$treatment)){
        i <- i+1
        mydf0 <- subset(mydf, Species==sp & treatment==tr)
        addkw0 <- subset(addkw, Species==sp & treatment==tr)
        pv.final0 <- pv.final[grep(paste(sp,tr), pv.final$group1), ]
        num.signif <- sum(pv.final0$p.value <= 0.05)
        P <- ggplot(mydf0,aes(x=variable, y=value)) +
            geom_boxplot(aes(fill=Species)) +
            stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
            facet_grid(treatment~Species, scales="free", space="free_x") +
            scale_fill_manual(values=myPal[i]) + #WHY IS COLOR IGNORED?
            geom_text(data=addkw0, hjust=0, size=4.5, aes(x=0, y=round(max(mydf0$value, na.rm=TRUE)+0.5), label=paste0("KW p=",p.value))) +
            geom_signif(test="wilcox.test", comparisons = a[which(pv.final0$p.value<=0.05)],#I can use "a"here
              map_signif_level = F,            
              vjust=0,
              textsize=4,
              size=0.5,
              step_increase = 0.05)
        if (i==1){
            P <- P + theme(legend.position="none",
                  axis.text.x=element_blank(),
                  axis.text.y=element_text(size=20),
                  axis.title=element_blank(),
                  axis.ticks.x=element_blank(),
                  strip.text.x=element_text(size=20,face="bold"),
                  strip.text.y=element_text(size=20,face="bold"))
        }
        if (i==4){
            P <- P + theme(legend.position="none",
                  axis.text.x=element_text(size=20, angle=90, hjust=1),
                  axis.text.y=element_text(size=20),
                  axis.title=element_blank(),
                  strip.text.x=element_text(size=20,face="bold"),
                  strip.text.y=element_text(size=20,face="bold"))
        }
        if ((i==2)|(i==3)){
            P <- P + theme(legend.position="none",
                  axis.text.x=element_blank(),
                  axis.text.y=element_blank(),
                  axis.title=element_blank(),
                  axis.ticks.x=element_blank(),
                  axis.ticks.y=element_blank(),
                  strip.text.x=element_text(size=20,face="bold"),
                  strip.text.y=element_text(size=20,face="bold"))
        }
        if ((i==5)|(i==6)){
            P <- P + theme(legend.position="none",
                  axis.text.x=element_text(size=20, angle=90, hjust=1),
                  axis.text.y=element_blank(),
                  #axis.ticks.y=element_blank(), #WHY SPECIFYING THIS GIVES ERROR?
                  axis.title=element_blank(),
                  axis.ticks.y=element_blank(),
                  strip.text.x=element_text(size=20,face="bold"),
                  strip.text.y=element_text(size=20,face="bold"))
        }
        #WHY USING THE CODE BELOW TO CHANGE NUMBERS TO ASTERISKS I GET ERRORS?
        #P2 <- ggplot_build(P)
        #P2$data[[3]]$annotation <- rep(subset(pv.final0, p.value<=0.05)$map.signif, each=3)
        #P <- plot(ggplot_gtable(P2))
        sptr <- paste(sp,tr)
        mylist[[sptr]] <- list(num.signif, P)
    }
    }
    return(mylist)
}
p.list2 <- plot.list2(mydf, pv.final, addkw, a, myPal)
y.rng <- range(mydf$value)
# Get the highest number of significant p-values across all three "facets"
height.factor <- 0.5
max.signif <- max(sapply(p.list2, function(x) x[[1]]))
# Lay out the three plots as facets (one for each Species), but adjust so that y-range is same for each facet. Top of y-range is adjusted using max_signif.
png(filename="test2.png", height=800, width=1200)
grid.arrange(grobs=lapply(p.list2, function(x) x[[2]] +
             scale_y_continuous(limits=c(y.rng[1], y.rng[2] + height.factor*max.signif))), 
             ncol=length(unique(mydf$Species)), top="Random title", left="Value") #HOW TO CHANGE THE SIZE OF THE TITLE AND THE Y AXIS TEXT?
             #HOW TO ADD A COMMON LEGEND?
dev.off()

That produces the following plot:

test2

Now the color problem becomes more striking, the facet heights are uneven, and something should be done with the redundant facet strip texts too.

I am stuck at this point, so would appreciate any help. Sorry for the long question, but I think it is almost there! Thanks!!

399

asked Oct 17 '17 05:10

DaniCee

2 Answers

You can try following. As your code is really busy and for me too complicated to understand, I suggest a different approach. I tried to avoid loops and to use the tidyverse as much as possible. Thus, first I created your data. Then calculated kruskal wallis tests as this was not possible within ggsignif. Afterwards I will plot all p.values using geom_signif. Finally, insignificant ones will be removed and a step increase is added.

1- Make coloring work done

2- Show asterisks instead of numbers done

...and for the win:

3- Make a common legend done

4- Place Kruskal-Wallis line on top done, I placed the values at the bottom

5- Change the size (and alignment) of the title and y axis text done

library(tidyverse)
library(ggsignif)

# 1. your data
set.seed(2)
df <- as.tbl(iris) %>% 
  mutate(treatment=rep(c("A","B"), length(iris$Species)/2)) %>% 
  gather(key, value, -Species, -treatment) %>% 
  mutate(value=rnorm(n())) %>% 
  mutate(key=factor(key, levels=unique(key))) %>% 
  mutate(both=interaction(treatment, key, sep = " "))

# 2. Kruskal test
KW <- df %>% 
  group_by(Species) %>%
  summarise(p=round(kruskal.test(value ~ both)$p.value,2),
            y=min(value),
            x=1) %>% 
  mutate(y=min(y))

# 3. Plot  
P <- df %>% 
ggplot(aes(x=both, y=value)) + 
  geom_boxplot(aes(fill=Species)) + 
  facet_grid(~Species) +
  ylim(-3,7)+
  theme(axis.text.x = element_text(angle=45, hjust=1)) +
  geom_signif(comparisons = combn(levels(df$both),2,simplify = F),
              map_signif_level = T) +
  stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
  xlab("") +
  geom_text(data=KW,aes(x, y=y, label=paste0("KW p=",p)),hjust=0) +
  ggtitle("Plot") + ylab("This is my own y-lab")

# 4. remove not significant values and add step increase
P_new <- ggplot_build(P)
P_new$data[[2]] <- P_new$data[[2]] %>% 
  filter(annotation != "NS.") %>% 
  group_by(PANEL) %>%
  mutate(index=(as.numeric(group[drop=T])-1)*0.5) %>% 
  mutate(y=y+index,
         yend=yend+index) %>% 
  select(-index) %>% 
  as.data.frame()
# the final plot  
plot(ggplot_gtable(P_new))

enter image description here

and similar approach using two facets

# --------------------
# 5. Kruskal
KW <- df %>% 
  group_by(Species, treatment) %>%
  summarise(p=round(kruskal.test(value ~ both)$p.value,2),
            y=min(value),
            x=1) %>% 
  ungroup() %>% 
  mutate(y=min(y))


# 6. Plot with two facets  
P <- df %>% 
  ggplot(aes(x=key, y=value)) + 
  geom_boxplot(aes(fill=Species)) + 
  facet_grid(treatment~Species) +
  ylim(-5,7)+
  theme(axis.text.x = element_text(angle=45, hjust=1)) +
  geom_signif(comparisons = combn(levels(df$key),2,simplify = F),
              map_signif_level = T) +
  stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
  xlab("") +
  geom_text(data=KW,aes(x, y=y, label=paste0("KW p=",p)),hjust=0) +
  ggtitle("Plot") + ylab("This is my own y-lab")

# 7. remove not significant values and add step increase
P_new <- ggplot_build(P)
P_new$data[[2]] <- P_new$data[[2]] %>% 
  filter(annotation != "NS.") %>% 
  group_by(PANEL) %>%
  mutate(index=(as.numeric(group[drop=T])-1)*0.5) %>% 
  mutate(y=y+index,
         yend=yend+index) %>% 
  select(-index) %>% 
  as.data.frame()
# the final plot  
plot(ggplot_gtable(P_new))

enter image description here

Edit.

Regarding to your p.adjust needs, you can set up a function on your own and calling it directly within geom_signif().

wilcox.test.BH.adjusted <- function(x,y,n){
  tmp <- wilcox.test(x,y)
  tmp$p.value <- p.adjust(tmp$p.value, n = n,method = "BH")
  tmp
}  

geom_signif(comparisons = combn(levels(df$both),2,simplify = F),
          map_signif_level = T, test = "wilcox.test.BH.adjusted", 
          test.args = list(n=8))

The challenge is to know how many independet tests you will have in the end. Then you can set the n by your own. Here I used 8. But this is maybe wrong.

132

answered Nov 04 '22 20:11

Roman

Constructing ggplots in a loop has always been known to produce confusing results, and for the explanation of point 1 I'll refer to this question and many others. There's also a hint there about evaluating the ggplot object on the spot, e.g. via print. Re point 2, you were close, a bit of debugging with trial and error helped. Here's the complete code for plot.list:

plot.list=function(mydf, pv.final, addkw, a, myPal){
    mylist <- list()
    i <- 0
    for (sp in unique(mydf$Species)){
        i <- i+1
        mydf0 <- subset(mydf, Species==sp)
        addkw0 <- subset(addkw, Species==sp)
        pv.final0 <- pv.final[grep(sp, pv.final$group1), ]
        num.signif <- sum(pv.final0$p.value <= 0.05)
        P <- ggplot(mydf0,aes(x=both, y=value)) +
            geom_boxplot(aes(fill=Species)) +
            stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
            facet_grid(~Species, scales="free", space="free_x") +
            scale_fill_manual(values=myPal[i]) +
            geom_text(data=addkw0, hjust=0, size=4.5, aes(x=0, y=round(max(mydf0$value, na.rm=TRUE)+0.5), label=paste0("KW p=",p.value))) +
            geom_signif(test="wilcox.test", comparisons = a[which(pv.final0$p.value<=0.05)],#I can use "a"here
                        map_signif_level = F,            
                        vjust=0,
                        textsize=4,
                        size=0.5,
                        step_increase = 0.05)
        if (i==1){
            P <- P + theme(legend.position="none",
                           axis.text.x=element_text(size=20, angle=90, hjust=1),
                           axis.text.y=element_text(size=20),
                           axis.title=element_blank(),
                           strip.text.x=element_text(size=20,face="bold"),
                           strip.text.y=element_text(size=20,face="bold"))
        } else{
            P <- P + theme(legend.position="none",
                           axis.text.x=element_text(size=20, angle=90, hjust=1),
                           axis.text.y=element_blank(),
                           axis.ticks.y=element_blank(),
                           axis.title=element_blank(),
                           strip.text.x=element_text(size=20,face="bold"),
                           strip.text.y=element_text(size=20,face="bold"))
        }
        P2 <- ggplot_build(P)
        P2$data[[4]]$annotation <- rep(subset(pv.final0, p.value<=0.05)$map.signif, each=3)
        P <- ggplot_gtable(P2)
        mylist[[sp]] <- list(num.signif, P)
    }
    return(mylist)
}

Note that we can no longer modify the plot via ggplot semantics, since we already applied ggplot_build/ggplot_gtable, so scale modification is no longer possible. If you want to preserve it, move it inside the plot.list function. So, changing to

grid.arrange(grobs=lapply(p.list, function(x) x[[2]]), 
             ncol=length(unique(mydf$Species)), top="Random title", left="Value")

yields

enter image description here

That's not a complete solution, of course, but I hope that helps.

answered Nov 04 '22 21:11

tonytonov

Related questions
                            
                                Embed reactively generated URL in shiny
                            
                                How to draw an $\alpha$ confidence areas on a 2D-plot?
                            
                                Survival Plots with Survminer - Blank Plot
                            
                                ggplot date scales shifts one month forward
                            
                                In-line R expression returns incorrect value
                            
                                Shiny DownloadHandler doesn't update filename using Sys.time()
                            
                                Using lapply() in renderUI() in Shiny Module
                            
                                conditionally insert rows to a dataframe using dplyr
                            
                                Click on points in a leaflet map as input for a plot in shiny
                            
                                R Shiny synchronize filters on multiple tabs
                            
                                OSM, rgeos, osmar, area calculation, does not add up
                            
                                Normalizing y-axis in density plots in R ggplot to proportion by group
                            
                                How to avoid class name conflict with a package "loaded via a namespace (and not attached)" (qdap & openssl)
                            
                                Equivalent of `poly.counts` to count lat/long pairs falling inside of polygons with the sf package
                            
                                Bootstrapping a vector of results, by group in R
                            
                                Multiple rows in facet_grid
                            
                                Odd behaviour of data.table's update on non-equi self-join
                            
                                plotly - different colours for different surfaces
                            
                                How to include bash scripts in a package?
                            
                                how to add multiple line breaks conveniently in shiny?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Boxplots with Wilcoxon significance levels, and facets, show only significant comparisons with asterisks

Tags:

r

testing

ggplot2

facet

boxplot

DaniCee

People also ask

2 Answers

Roman

tonytonov

Recent Activity

Donate For Us