Barplot with 2 variables side by side

Tags: plot, r, ggplot2

I am trying to make a fairly common kind of barplot, but despite reading tons of documentation on plotting in R, and the documentation of ggplot and all its layers, I cannot get the plot to look the way I want.

My data is fairly simple.

aov.data
   Block    RTreg    RTrnd
1 Block1   0.0000 862.0707 
2 Block2 667.2081 770.4315 
3 Block3 645.4730 696.0200
4 Block4 674.5200 659.4765
5 Block5 651.4295 633.7333

What I want to get is a barplot with the Block column on the x axis, working as a categorical variable. On the y axis I want to plot the values of the RTreg and RTrnd columns. This is the part I cannot get right. What I would like is to have two bars per tick mark on the x axis: one bar representing the value of RTreg in that block, and one bar representing the value of RTrnd in that block. Something like this:

[Image: example barplot with two sets of two side-by-side bars]

But with 5 sets of two bars instead of two.

So far, I have used the following code:

ggplot(aov.data,aes(x=Block)) + 
          geom_bar(aes(y=RTreg),stat="identity",position="dodge",col="blue") +
          geom_bar(position="dodge") +   
          geom_bar(aes(y=RTrnd),stat="identity",position="dodge",col="red")

What I thought I was doing was first creating the base plot with the categorical x axis containing the blocks. Then I thought the first geom_bar added the RTreg column and the second geom_bar added the RTrnd column. With the position option set to dodge, I thought I was going to get the two bars side by side. However, the plot I am getting is:

[Image: the plot produced by the code above]

Any idea on how to get the two bars side by side, and hopefully with different colors and a legend for each? I would really appreciate some guidance here.

Thanks in advance.

HernanLG asked Mar 01 '13 12:03




2 Answers

You have to reshape your data frame from wide to long and then you don't need to set bars for each condition separately.

Assuming that the data frame is named df:

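library(ggplot2)
library(reshape2)
# Reshape from wide to long: one row per Block/variable pair
df.long <- melt(df, id.vars = "Block")
# fill = variable gives each original column its own colour and legend entry
ggplot(df.long, aes(Block, value, fill = variable)) +
     geom_bar(stat = "identity", position = "dodge")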
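To see what the wide-to-long reshape produces, here is a minimal sketch that rebuilds the long table with base R only, using the data from the question. It is an illustration rather than the answer's method: stack() is swapped in for melt() so the example runs without extra packages, and the column names variable and value are set by hand to mirror what melt() would return.

```r
# Data copied from the question
aov.data <- data.frame(
  Block = paste0("Block", 1:5),
  RTreg = c(0.0000, 667.2081, 645.4730, 674.5200, 651.4295),
  RTrnd = c(862.0707, 770.4315, 696.0200, 659.4765, 633.7333)
)

# stack() puts the two measurement columns on top of each other and
# records the source column name in a factor called `ind`
stacked <- stack(aov.data[, c("RTreg", "RTrnd")])

# Assemble a long table: one row per Block/variable pair
# (column names chosen here to match the melt() output)
df.long <- data.frame(
  Block    = rep(aov.data$Block, times = 2),
  variable = stacked$ind,
  value    = stacked$values
)

print(df.long)
```

Each block now appears twice, once per variable, which is what lets a single geom_bar call with fill = variable and position = "dodge" draw the two bars next to each other.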


Didzis Elferts answered Oct 15 '22 23:10



An alternative approach using base R functions:

# Taking the dataset mentioned in the question
df = data.frame(
  Block = c("Block 1", "Block 2", "Block 3", "Block 4", "Block 5"),
  RTreg = c(0.0, 667.208, 645.47, 674.52, 651.42),
  RTrnd = c(862.07, 770.43, 696.02, 659.476, 633.733)
)

# Create a matrix which will help in creating the plot.
# An empty (NA-filled) matrix is needed before copying the data in.
value_matrix = matrix(NA, nrow = 2, ncol = 5)
value_matrix[1, ] = df$RTreg
value_matrix[2, ] = df$RTrnd

# Note that the "beside" argument has to be TRUE in order to place the bars side by side
barplot(value_matrix, names.arg = df$Block, beside = TRUE,
        col = c("peachpuff", "skyblue"), legend.text = c("RTreg", "RTrnd"))

What happens above?

To answer that, look at the first argument of barplot(), 'height'. If you provide a vector as input, the function generates a regular bar graph, as you would expect. To plot a clustered bar graph, however, you have to provide the data as a matrix. Here comes the magic: for a clustered bar graph, your variables need to be in the rows of the matrix instead of the usual columns. In other words, instead of storing the data along the columns:

   RTreg     RTrnd
1  0.0000    862.0707 
2  667.2081  770.4315 
3  645.4730  696.0200

store the data across the rows:

       1        2        3
RTreg  0.0000   667.2081 645.4730
RTrnd  862.0707 770.4315 696.0200

This is how barplot() recognises the 2 different variables 'RTreg' & 'RTrnd'. If you have m variables that you wish to put into the cluster, set the 'nrow' argument of matrix() to m and copy each variable into its own row. To get the structure above, I simply created an empty matrix and copied 'RTreg' and 'RTrnd' into rows 1 and 2 respectively. Once the matrix is created, you are good to go: call barplot() with your matrix as the height argument, and don't forget to set the 'beside' argument to TRUE.
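The same rows-as-variables matrix can also be built without filling an empty matrix by hand. The following is a sketch, not part of the original answer: it assumes the df defined above and uses t(as.matrix(...)) to transpose the two numeric columns, so each variable ends up in its own row.

```r
# Same data frame as in the answer above
df <- data.frame(
  Block = c("Block 1", "Block 2", "Block 3", "Block 4", "Block 5"),
  RTreg = c(0.0, 667.208, 645.47, 674.52, 651.42),
  RTrnd = c(862.07, 770.43, 696.02, 659.476, 633.733)
)

# Transpose the numeric columns: variables become rows, blocks become columns
value_matrix <- t(as.matrix(df[, c("RTreg", "RTrnd")]))
colnames(value_matrix) <- as.character(df$Block)

# Column names label the groups on the x axis; row names feed the legend
mids <- barplot(value_matrix, beside = TRUE,
                col = c("peachpuff", "skyblue"),
                legend.text = rownames(value_matrix))
```

Because the row names carry the variable names, adding another variable only means adding its column name to the selection, and the number of colours in col should grow to match.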

Result of the above code:

[Image: the resulting clustered barplot]

Argon answered Oct 15 '22 21:10
