
ggplot2: plot 2 variables (lines and points) and align 2 plots

Tags: r, ggplot2

I recently started using ggplot2 and am running into a lot of difficulties. For now I just want to plot 2 different variables in one plot with both points and lines (type="b" in the base plot function), and have the resulting plot placed above, and aligned with, a histogram sharing the same x axis.

So I have this data.frame:

GO.df <- data.frame(GO.ID=paste("GO",c(1:29),sep=""),
                    occ=c(1:29),
                    pv=c(5.379594e-05, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 6.096906e-03, 6.096906e-03, 6.096906e-03, 6.096906e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 1.215791e-02, 1.215791e-02, 1.215791e-02, 1.517502e-02, 1.517502e-02, 1.517502e-02, 1.517502e-02, 1.818323e-02, 1.818323e-02, 1.818323e-02),
                    adj.pv=c(0.004088492, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.042000065, 0.042000065, 0.042000065, 0.044357749, 0.044357749, 0.044357749, 0.044357749, 0.047652596, 0.047652596, 0.047652596))

And want to reproduce this:

plot(GO.df$pv, type="b", col="red", ylim=c(0,0.05),ylab="",xlab="",xaxt="n")
lines(GO.df$adj.pv, type="b", col="blue")
axis(1, at=c(1:length(GO.df$GO.ID)), labels=GO.df$GO.ID, las=2)

The plot should sit above a histogram (of the variable "occ") and be aligned with it. This is what I have so far with ggplot2:

#install.packages("ggplot2")
library(ggplot2)
#install.packages("reshape")
library(reshape)
#install.packages("gridExtra")
library(gridExtra)
library(grid)  # textGrob() and gpar() come from grid

GO.df2 <- melt(GO.df, measure.vars=c("pv", "adj.pv"))
p1 <- ggplot(GO.df2, aes(x=GO.ID, y=value, colour=variable)) + geom_point() + ylab("p-values") + xlab(NULL)
# Use the unmelted GO.df here: GO.df2 holds every GO.ID twice, which would stack and double the bars
p2 <- ggplot(GO.df, aes(x=GO.ID, y=occ)) + geom_bar(stat="identity") + ylab("Num of Occurrences")
grid.arrange(
  p1,
  p2,
  nrow = 2,
  # gridExtra >= 2.0.0 names this argument `top`; older versions called it `main`
  top = textGrob("GO!", vjust = 1, gp=gpar(fontface = "bold", cex = 1.5)))

As you can see I am unable to:

1-plot both lines and points

2-have the data ordered as it should be in both plots, rather than scattered around (the plot function maintains the order).

3-have the two plots aligned with a minimal distance between them and no x axis in the one above.

4-have the plots aligned but still maintain the legend of the one above.

I hope you can help me with this; I'm still really new to ggplot2. Thanks a lot!

DaniCee asked Feb 16 '23 23:02


1 Answer

I would probably not use grid.arrange, but rather do something more like this:

# Stack two copies of the melted data, one copy per facet panel
dat <- rbind(GO.df2, GO.df2)
dat$grp <- factor(rep(c('p-values', 'Num of Occurrences'), each = nrow(GO.df2)),
                  levels = c('p-values', 'Num of Occurrences'))
# Keep the x axis in the order in which the IDs appear in the data
dat$GO.ID <- factor(dat$GO.ID, levels = unique(dat$GO.ID))

ggplot(dat, aes(x = GO.ID)) +
    facet_grid(grp ~ ., scales = "free_y") +
    geom_point(data = subset(dat, grp == 'p-values'),
               aes(y = value, colour = variable)) +
    geom_line(data = subset(dat, grp == 'p-values'),
              aes(y = value, colour = variable, group = variable)) +
    # GO.df2 holds each GO.ID twice (pv and adj.pv); keep one row per ID
    # so the stacked bars do not double the counts
    geom_bar(data = subset(dat, grp == 'Num of Occurrences' & variable == 'pv'),
             aes(y = occ), stat = "identity") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
    ylab("")


Plotting the lines simply required adding geom_line, and making sure the grouping was set correctly.
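
To illustrate the grouping point, here is a minimal sketch on a made-up toy data frame (the names toy, id and the values are assumptions for illustration, not part of the question's data). With a discrete x axis, ggplot2 by default groups by the combination of all discrete variables, so each point ends up in its own group and geom_line has nothing to connect unless group is mapped explicitly:

```r
library(ggplot2)

# Hypothetical toy data: two series measured at three ordered categories
toy <- data.frame(
  id = factor(rep(c("a", "b", "c"), times = 2), levels = c("a", "b", "c")),
  variable = rep(c("pv", "adj.pv"), each = 3),
  value = c(0.01, 0.02, 0.03, 0.02, 0.03, 0.04)
)

# group = variable tells geom_line to join the points of each series
# across the discrete x axis
p <- ggplot(toy, aes(x = id, y = value, colour = variable, group = variable)) +
  geom_point() +
  geom_line()
```

Printing p should draw one line per series; dropping group = variable would leave only the points.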

Ordering the x axis, like everything else in ggplot, simply requires creating a factor and ordering the levels properly.
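
A small base R sketch of why the factor levels matter (the vector ids is a made-up stand-in for the GO.ID column): by default the levels are sorted alphabetically, which puts "GO10" before "GO2", whereas passing levels = unique(...) keeps the order of first appearance.

```r
# Hypothetical IDs, built the same way as GO.ID in the question
ids <- paste("GO", c(1, 2, 10), sep = "")

# Default: levels are sorted as strings
default_levels <- levels(factor(ids))
print(default_levels)   # "GO1"  "GO10" "GO2"

# Explicit levels: order of first appearance in the data
ordered_levels <- levels(factor(ids, levels = unique(ids)))
print(ordered_levels)   # "GO1"  "GO2"  "GO10"
```

ggplot2 lays out a discrete axis in level order, so the second form gives the axis order that the base plot function showed.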

Aligning the plots is admittedly a bit trickier. It helps to try to massage faceting to do most of the aligning for you. To that end, I rbinded two copies of your data together, and created a grouping variable that will stand in as the different y axis labels.
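
To make the stacking step concrete, here is a sketch on a tiny made-up data frame (long, panels and the panel label "occurrences" are illustrative assumptions standing in for GO.df2 and the labels used above). Each copy of the data is tagged with the panel it will be drawn in:

```r
# Hypothetical small stand-in for the melted data frame
long <- data.frame(
  GO.ID = rep(c("GO1", "GO2"), times = 2),
  occ = rep(1:2, times = 2),
  variable = rep(c("pv", "adj.pv"), each = 2),
  value = c(0.001, 0.003, 0.004, 0.029)
)

# Illustrative panel labels; the level order decides which panel is on top
panels <- c("p-values", "occurrences")

# Two stacked copies: the first block belongs to the top panel,
# the second block to the bottom panel
stacked <- rbind(long, long)
stacked$grp <- factor(rep(panels, each = nrow(long)), levels = panels)

print(table(stacked$grp))
```

Each geom can then be given only the rows of its own panel with subset(stacked, grp == ...), which is what lets facet_grid handle the alignment.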

Then we can use facet_grid to force the facet strips to be on the y axis, allow free y scales, and then only pass the appropriate subset of the data to each geom.

Thanks to agstudy, for reminding me to rotate the x axis labels using theme.

joran answered Feb 27 '23 09:02