
Creating a histogram with multiple data series using multhist in R

I want to create a histogram with multiple data series on the same plot. The best method that I can find to do this is multhist(). I would like a plot in a style similar to hist(), and while ggplot() can also be used to perform this task, the graphics style is not what I want.

Here is some example data:

df <- structure(list(year = c(2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2012L, 
2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 
2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 
2012L, 2012L, 2012L), count = c(187L, 199L, 560L, 1000L, 850L, 
400L, 534L, 911L, 390L, 1008L, 1173L, 1222L, 810L, 950L, 752L, 
1125L, 468L, 710L, 290L, 670L, 855L, 614L, 1300L, 950L, 670L, 
888L, 490L, 557L, 741L, 700L, 954L, 378L, 512L, 780L, 951L, 398L, 
1544L, 903L, 769L, 1399L, 1021L, 1235L, 1009L, 1222L, 255L)), .Names = c("year", 
"count"), class = "data.frame", row.names = c(NA, -45L))

And here is the code that I have used so far:

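library(plotrix)
# Split the counts by year (year is stored as an integer)
d2011 <- df$count[df$year == 2011]
d2012 <- df$count[df$year == 2012]
year <- list(d2011, d2012)
# Grouped histogram: one bar per year within each 200-wide bin
mh <- multhist(year, xlab = "Count", ylab = "Frequency", main = "", cex.axis = 1,
               col = c("dark gray", "light gray"), breaks = seq(0, 1600, by = 200))
box(bty = "l", col = "black")
# Place the legend interactively by clicking on the plot
legend.text <- c("2011", "2012")
legend(locator(1), legend = legend.text, col = c("dark gray", "light gray"), pch = 15, bty = "n", cex = 0.8)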

This provides me with a 'barplot style' multi histogram, but I am having issues changing two graph parameters.

  1. I would like the plot to look more like a histogram and less like a barplot, so firstly I want to remove (or reduce) the space between the columns. I have tried passing space = NULL, but this argument does not appear to have any effect with multhist.

  2. I would like to change the x-axis so that axis tick marks are present in between bars on the plot and axis text is aligned with tick marks rather than positioned at the bar midpoint. I have tried using axis(side=1, …), but as multhist uses list objects to create plots these commands don’t appear to work.

Any suggestions would be greatly appreciated. Suggestions for other useful graphics packages that can plot histograms with multiple datasets would also be welcomed.

Emily asked Feb 15 '23 18:02


1 Answer

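multhist passes additional arguments on to barplot, so the spacing is controlled there. As the barplot documentation explains, with beside=TRUE the space argument takes two values: the gap between bars within a group and the gap between groups. Setting both to zero removes the spacing: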

multhist(year, xlab="Count", ylab="Frequency", main="", 
         cex.axis=1, col=c("dark gray", "light gray"), 
         breaks=seq(0,1600, by=200),
         space=c(0,0), beside=TRUE)
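
The call above removes the gaps but does not address the second point in the question (tick marks at the bin edges rather than labels at the bar midpoints). One possible approach, sketched here in base R rather than through multhist itself, is to bin each series with hist(plot = FALSE), draw the counts with barplot, and then add an axis by hand. The object names (counts, edges, mids) and the "topleft" legend position are illustrative choices, and the data are the values from the question written out as two vectors.

```r
# Data from the question, written out as one vector per year
d2011 <- c(187, 199, 560, 1000, 850, 400, 534, 911, 390, 1008, 1173, 1222,
           810, 950, 752, 1125, 468, 710, 290, 670, 855, 614, 1300)
d2012 <- c(950, 670, 888, 490, 557, 741, 700, 954, 378, 512, 780, 951, 398,
           1544, 903, 769, 1399, 1021, 1235, 1009, 1222, 255)
breaks <- seq(0, 1600, by = 200)

# Bin both series on the same breaks without drawing anything
counts <- rbind(
  "2011" = hist(d2011, breaks = breaks, plot = FALSE)$counts,
  "2012" = hist(d2012, breaks = breaks, plot = FALSE)$counts
)

# Grouped bars with no gaps; axisnames = FALSE suppresses the midpoint labels
mids <- barplot(counts, beside = TRUE, space = c(0, 0), axisnames = FALSE,
                col = c("dark gray", "light gray"),
                xlab = "Count", ylab = "Frequency")

# With zero spacing and unit-width bars, each bin spans nrow(counts) units
# on the x-axis, so the bin edges sit at 0, 2, 4, ... for two series
edges <- (seq_along(breaks) - 1) * nrow(counts)
axis(side = 1, at = edges, labels = breaks)

box(bty = "l", col = "black")
legend("topleft", legend = rownames(counts),
       fill = c("dark gray", "light gray"), bty = "n", cex = 0.8)
```

The tick positions depend on the number of series: with k series and zero spacing, each bin occupies k units, so the same edges calculation should carry over if more years are added as extra rows of counts.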


Here is an example with ggplot2 and theme_bw:

library(ggplot2)

ggplot(df, aes(x=count,group=year,fill=as.factor(year))) + 
  geom_histogram(position="identity", alpha=0.5, breaks=seq(0,1600, by=200), closed="right") +
  scale_fill_discrete(name="Year") +
  theme_bw(base_size=20) +
  xlab("values")


Or if you really want it like the plot from multhist (which is not as easy to interpret):

ggplot(df, aes(x=count,group=year,fill=as.factor(year))) + 
  geom_histogram(position="dodge", breaks=seq(0,1600, by=200), closed="right") +
  scale_fill_discrete(name="Year") +
  theme_bw(base_size=20) +
  xlab("values") +
  scale_x_continuous(breaks=seq(100,1500, by=200))


Roland answered Feb 18 '23 10:02