I recently started using ggplot2 and I am running into a lot of difficulties. For now I just want to plot two different variables in one plot with points and lines (type="b" in the plot function), and have the resulting plot placed above, and aligned with, a histogram sharing the same x axis.
So I have this data.frame:
GO.df <- data.frame(GO.ID=paste("GO",c(1:29),sep=""),
occ=c(1:29),
pv=c(5.379594e-05, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 3.052953e-03, 6.096906e-03, 6.096906e-03, 6.096906e-03, 6.096906e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 9.131884e-03, 1.215791e-02, 1.215791e-02, 1.215791e-02, 1.517502e-02, 1.517502e-02, 1.517502e-02, 1.517502e-02, 1.818323e-02, 1.818323e-02, 1.818323e-02),
adj.pv=c(0.004088492, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.029003053, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.036527537, 0.042000065, 0.042000065, 0.042000065, 0.044357749, 0.044357749, 0.044357749, 0.044357749, 0.047652596, 0.047652596, 0.047652596))
And I want to reproduce this:
plot(GO.df$pv, type="b", col="red", ylim=c(0,0.05),ylab="",xlab="",xaxt="n")
lines(GO.df$adj.pv, type="b", col="blue")
axis(1, at=c(1:length(GO.df$GO.ID)), labels=GO.df$GO.ID, las=2)
The plot should sit above a histogram (of the variable "occ") and be aligned with it. This is what I have so far with ggplot2:
#install.packages("ggplot2")
library(ggplot2)
#install.packages("reshape")
library(reshape)
#install.packages("gridExtra")
library(gridExtra)
GO.df2 <- melt(GO.df, measure.vars=c("pv", "adj.pv"))
p1 <- ggplot(GO.df2, aes(x=GO.ID, y=value, colour=variable)) + geom_point() + ylab("p-values") + xlab(NULL)
p2 <- ggplot(GO.df2, aes(x=GO.ID, y=occ)) + geom_bar(stat="identity") + ylab("Num of Ocurrences")
grid.arrange(
p1,
p2,
nrow = 2,
main = textGrob("GO!", vjust = 1, gp=gpar(fontface = "bold", cex = 1.5)))
As you can see, I am unable to:
1. plot both lines and points;
2. have the data ordered as it should be, rather than scattered around (the plot function keeps the order), in both plots;
3. have the two plots aligned with a minimal distance between them and no x axis on the upper one;
4. have the plots aligned while still keeping the legend of the upper one.
I hope you can help me with this; I'm still really new to ggplot2. Thanks a lot!
I would probably not use grid.arrange, but rather do something more like this:
# Stack two copies of the melted data, one copy per facet panel
dat <- rbind(GO.df2, GO.df2)
dat$grp <- factor(rep(c('p-values', 'Num of Ocurrences'), each = nrow(GO.df2)),
                  levels = c('p-values', 'Num of Ocurrences'))
# Keep GO.ID in its original order instead of alphabetical order
dat$GO.ID <- factor(dat$GO.ID, levels = unique(dat$GO.ID))
ggplot(dat, aes(x = GO.ID)) +
  facet_grid(grp ~ ., scales = "free_y") +
  geom_point(data = subset(dat, grp == 'p-values'),
             aes(y = value, colour = variable)) +
  geom_line(data = subset(dat, grp == 'p-values'),
            aes(y = value, colour = variable, group = variable)) +
  # Keep one row per GO.ID for the bars; otherwise the pv and adj.pv rows stack and double occ
  geom_bar(data = subset(dat, grp == 'Num of Ocurrences' & variable == 'pv'),
           aes(y = occ), stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ylab("")
Plotting the lines simply required adding geom_line, and making sure the grouping was set correctly.
Ordering the x axis, like everything else in ggplot, simply requires creating a factor and ordering the levels properly.
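To illustrate the ordering point, here is a minimal sketch in base R. The vector ids and the two result names are made up for this illustration; only factor, levels and unique are real functions. It compares the default level order with the order of first appearance:

```r
# Hypothetical IDs, built the same way as GO.ID in the question
ids <- paste("GO", 1:12, sep = "")

# Default: levels are sorted as strings, so "GO10" comes before "GO2"
default_order <- levels(factor(ids))

# Explicit levels: keep the order in which the IDs first appear
data_order <- levels(factor(ids, levels = unique(ids)))

print(default_order)
print(data_order)
```

ggplot2 lays out a discrete x axis in level order, so the second form should keep GO1, GO2, GO3, ... in sequence on the axis.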
Aligning the plots is admittedly a bit trickier. It helps to try to massage faceting to do most of the aligning for you. To that end, I rbind-ed two copies of your data together, and created a grouping variable that will stand in as the different y axis labels.
Then we can use facet_grid to force the facet strips to be on the y axis, allow free y scales, and then only pass the appropriate subset of the data to each geom.
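As a rough sketch of what the stacking step produces, the following uses a tiny made-up data frame (small and stacked are hypothetical names; the columns mimic the melted data from the question) and base R only:

```r
# Hypothetical miniature of the melted data: two IDs, two measured variables
small <- data.frame(
  GO.ID    = c("GO1", "GO2", "GO1", "GO2"),
  occ      = c(1, 2, 1, 2),
  variable = c("pv", "pv", "adj.pv", "adj.pv"),
  value    = c(0.01, 0.02, 0.03, 0.04)
)

# Two copies of the rows, each copy labelled with the panel it belongs to
stacked <- rbind(small, small)
stacked$grp <- factor(
  rep(c("p-values", "Num of Ocurrences"), each = nrow(small)),
  levels = c("p-values", "Num of Ocurrences")  # level order sets panel order
)

# Rows that would be handed to each panel's geoms
points_rows <- subset(stacked, grp == "p-values")
bar_rows    <- subset(stacked, grp == "Num of Ocurrences")

print(table(stacked$grp))
print(table(bar_rows$GO.ID))
```

Note that each GO.ID still occurs once per measured variable in bar_rows. Bars drawn with stat = "identity" from all of those rows would stack, so it may be worth keeping only one variable's rows for the bar layer.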
Thanks to agstudy for reminding me to rotate the x axis labels using theme.