I'm trying to generate many individual plots, each plotting the level of one single gene (=column) for each cell (=row). In the code I also have two subsets of "cell", based on whether each cell has a value of >0 for gene1 or not (this is handled with dplyr).
My attempt below plots the values for all genes/columns in one single pdf plot, once. Any advice on how I can alter my code to generate one plot per gene/column?
Dataset:
gene1 gene2 gene3 gene4 gene5
cell_1 0.0000 0.279204 25.995400 46.171700 94.234100
cell_2 0.0000 23.456000 77.339800 194.241000 301.234000
cell_3 2.0000 13.100000 45.309200 0.776565 0.000000
cell_4 0.0000 10.500000 107.508000 3.032500 0.000000
cell_5 3.0000 0.000000 0.266139 0.762981 123.371000
Code:
library(ggplot2)
library(dplyr)
library(tidyr)
#Loop making many single box plots
df3 <- df2 %>% as.data.frame %>% mutate(Cell= rownames(.), positive = df2$gene1>0) %>%
gather(., key= gene, value="value", -Cell,-positive) %>%
mutate( absolute= abs(value), logabs= log(absolute+1))
for (i in unique(df3$gene)) {
geneplot <- df3 %>% ggplot(., aes(x=gene, y=logabs, fill=positive)) +
geom_boxplot() +
xlab("Gene") + ylab("Expression level (TPM log)") +
theme_classic(base_size = 14, base_family = "Helvetica") +
theme(axis.text.y=element_text(size=14)) +
theme(axis.title.y=element_text(size=14, face="bold")) +
theme(axis.text.x=element_text(size=14)) +
theme(axis.title.x=element_text(size=14, face="bold")) +
scale_fill_brewer(palette="Pastel1")
print(geneplot)
ggsave(sprintf("%s.png", df3$gene))
dev.off()
}
gene1<-c(0.0000, 0.0000, 2.0000, 0.0000, 3.0000)
gene2<-c(0.279204, 23.456000, 13.100000 , 10.500000, 3.0000)
gene3<-c(25.995400, 77.339800, 45.309200, 107.508000, 0.266139)
gene4<-c(46.171700, 194.241000, 0.776565, 3.032500, 0.762981)
gene5<-c(94.234100, 301.234000, 0.000000, 0.000000, 3.0000)
df<-data.frame(gene1, gene2, gene3,gene4,gene5)
df <- df %>%
as.data.frame %>%
mutate(Cell= rownames(.), positive = df$gene1>0) %>%
gather(., key= gene, value="value", -Cell,-positive) %>%
mutate( absolute= abs(value), logabs= log(absolute+1))
ggplot(data= df , aes(x=gene, y=logabs, fill=positive))+
geom_boxplot()+facet_wrap(~ gene)

UPDATE
I'm not exactly sure what the poster is asking, but here are a couple of interpretations:
The actual data has additional genes whose values are all zero, so using facet_wrap(~ gene) creates an unnecessary extra panel, as in the following:
gene1<-c(0.0000, 0.0000, 2.0000, 0.0000, 3.0000)
gene2<-c(0.279204, 23.456000, 13.100000 , 10.500000, 3.0000)
gene3<-c(25.995400, 77.339800, 45.309200, 107.508000, 0.266139)
gene4<-c(46.171700, 194.241000, 0.776565, 3.032500, 0.762981)
gene5<-c(94.234100, 301.234000, 0.000000, 0.000000, 3.0000)
gene6<-c(0.0000, 0.0000, 0.0000, 0.0000, 0.0000)
df<-data.frame(gene1, gene2, gene3,gene4,gene5, gene6)
df <- df %>%
as.data.frame %>%
mutate(Cell= rownames(.), positive = df$gene1>0) %>%
gather(., key= gene, value="value", -Cell,-positive) %>%
mutate( absolute= abs(value), logabs= log(absolute+1))
ggplot(data= df , aes(x=gene, y=logabs, fill=positive))+
geom_boxplot()+facet_wrap(~ gene)

To avoid that, simply run:
df<-filter(df, value>0)
ggplot(data= df , aes(x=gene, y=logabs, fill=positive))+
geom_boxplot()+facet_wrap(~ gene)
This gives the faceted plot without the empty gene6 panel.
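One caveat: filter(df, value > 0) removes every row with a zero value, so it also drops individual zero measurements from genes that are otherwise expressed, which changes their boxplots. If the goal is only to drop genes that are zero in every cell, a grouped filter is one option. The sketch below assumes the long format used above (columns Cell, gene, value); the toy values and the names `long` and `kept` are made up for illustration.

```r
library(dplyr)

# Hypothetical long-format data with the same column names as above
long <- data.frame(
  Cell  = rep(c("1", "2", "3"), times = 3),
  gene  = rep(c("gene1", "gene2", "gene6"), each = 3),
  value = c(0, 2, 3,   0.28, 0, 10.5,   0, 0, 0)
)

# Keep a gene only if at least one cell has a value above zero;
# zero measurements inside the kept genes are preserved
kept <- long %>%
  group_by(gene) %>%
  filter(any(value > 0)) %>%
  ungroup()
```

Here gene6 disappears entirely, while the zero entries of gene1 and gene2 stay in the data.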

If that's not your concern, my apologies. Perhaps you want to get rid of the empty x-axis breaks that each panel shows for the other genes, as @Huub Hoofs pointed out. To accomplish that, as Huub Hoofs suggests, try the following:
ggplot(data= df , aes(x=gene, y=logabs, fill=positive))+
geom_boxplot()+facet_grid(~ gene, scales = "free")

OR
ggplot(data= df , aes(x=gene, y=logabs, fill=positive))+
geom_boxplot(aes(1))+facet_wrap(~ gene)
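If what is wanted is literally one image file per gene rather than facets, the loop in the question can be adapted by filtering to the current gene inside the loop and passing that gene's name and plot to ggsave(). This is only a minimal sketch under some assumptions: it uses a cut-down copy of the toy data, writes to tempdir() (a placeholder output location), picks an arbitrary 4 x 4 inch size, and omits the base_family font setting; the names `wide`, `long`, `out_dir` and `saved` are hypothetical.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Cut-down copy of the toy data from this answer
wide <- data.frame(
  gene1 = c(0, 0, 2, 0, 3),
  gene2 = c(0.279204, 23.456, 13.1, 10.5, 3),
  gene3 = c(25.9954, 77.3398, 45.3092, 107.508, 0.266139)
)

# Reshape to long format, flagging cells where gene1 > 0
long <- wide %>%
  mutate(Cell = rownames(wide), positive = gene1 > 0) %>%
  gather(key = "gene", value = "value", -Cell, -positive) %>%
  mutate(logabs = log(abs(value) + 1))

out_dir <- tempdir()     # assumption: replace with your own output folder
saved <- character(0)    # collects the paths of the files written

for (g in unique(long$gene)) {
  # Build the plot from the rows of the current gene only
  p <- long %>%
    filter(gene == g) %>%
    ggplot(aes(x = gene, y = logabs, fill = positive)) +
    geom_boxplot() +
    xlab("Gene") + ylab("Expression level (TPM log)") +
    theme_classic(base_size = 14) +
    scale_fill_brewer(palette = "Pastel1")

  # One file per gene, named after the current gene
  path <- file.path(out_dir, sprintf("%s.png", g))
  ggsave(path, plot = p, width = 4, height = 4)
  saved <- c(saved, path)
}
```

Two things differ from the loop in the question: the data are subset with filter(gene == g), and the file name is built from the loop variable instead of the whole df3$gene column. Because ggsave() opens and closes its own graphics device, the print() and dev.off() calls are not needed for saving.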
