
How to plot a stacked and grouped bar chart in ggplot?

I have a data frame like the one below:

id   month   type   count
--   -----   ----   -----
1    1       1      10
1    1       2      9
1    1       3      26
1    2       1      60
1    2       2      90
1    2       3      80
2    1       1      10
2    1       2      9
2    1       3      26
2    2       1      60
2    2       2      90
2    2       3      80
3    1       1      10
3    1       2      9
3    1       3      26
3    2       1      60
3    2       2      90
3    2       3      80

I thought the best way to visualize this would be a stacked and grouped bar chart, something like the one below:

(image: Stacked and Grouped Bar Chart)

So I tried with

ggplot(df,aes(x=id,y=count,fill=month))+geom_bar(stat="identity",position=position_dodge())+geom_text(aes(label=count),size=3)

This gave a plot that was a bit different from what I expected. Any help is appreciated.

Ricky asked Oct 06 '17 02:10



2 Answers

Suppose you want id on the x-axis, the months side by side, and the types stacked within each bar. You can split the data frame by month and add one bar layer per month, shifting x by a small amount for the second month so the two bars do not overlap:

library(dplyr)
library(ggplot2)

barwidth <- 0.35

# Label position = midpoint of each stacked segment. Rows are sorted by
# descending type so the cumulative sum follows the stacking order,
# bottom segment first.
month_one <- filter(df, month == 1) %>% 
    group_by(id) %>% arrange(-type) %>% 
    mutate(pos = cumsum(count) - count / 2)

month_two <- filter(df, month == 2) %>% 
    group_by(id) %>% arrange(-type) %>% 
    mutate(pos = cumsum(count) - count / 2)

ggplot() + 
    # month 1: bars centred on id
    geom_bar(data = month_one, 
             mapping = aes(x = id, y = count, fill = as.factor(type)), 
             stat = "identity", 
             position = "stack", 
             width = barwidth) + 
    geom_text(data = month_one, 
              aes(x = id, y = pos, label = count)) + 
    # month 2: same bars, shifted right by one bar width plus a small gap
    geom_bar(data = month_two, 
             mapping = aes(x = id + barwidth + 0.01, y = count, fill = as.factor(type)), 
             stat = "identity", 
             position = "stack", 
             width = barwidth) + 
    geom_text(data = month_two, 
              aes(x = id + barwidth + 0.01, y = pos, label = count)) + 
    labs(fill = "type")

This gives, for each id, two adjacent bars (one per month) with the types stacked inside each bar.
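As a variation on the approach above, the per-month split can probably be folded into a single grouped step. This is only a sketch, not the answerer's code: the names `labelled` and `x` are introduced here for illustration, and `df` is rebuilt inline so the block runs on its own (it mirrors the dput(df) output shown in this answer).

```r
library(dplyr)
library(ggplot2)

# Same values as the dput(df) output in this answer (rebuilt here as an assumption)
df <- data.frame(
    id    = rep(1:3, each = 6),
    month = rep(rep(1:2, each = 3), times = 3),
    type  = rep(1:3, times = 6),
    count = rep(c(10L, 9L, 26L, 60L, 90L, 80L), times = 3)
)

barwidth <- 0.35

labelled <- df %>%
    group_by(id, month) %>%
    arrange(desc(type), .by_group = TRUE) %>%
    mutate(
        pos = cumsum(count) - count / 2,            # midpoint of each stacked segment
        x   = id + (month - 1) * (barwidth + 0.01)  # month 2 is shifted to the right
    ) %>%
    ungroup()

# One bar layer and one text layer cover both months
p <- ggplot(labelled, aes(x = x, y = count, fill = as.factor(type))) +
    geom_col(width = barwidth) +
    geom_text(aes(y = pos, label = count)) +
    labs(fill = "type")
```

Printing `p` draws the chart. Because the offset is computed from `month`, the same code should extend to more than two months, provided `barwidth` is small enough that the shifted bars do not run into the next id.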


The data used above, as returned by dput(df):
structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), month = c(1L, 1L, 1L, 2L, 2L, 
2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), type = c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L), count = c(10L, 9L, 26L, 60L, 90L, 80L, 10L, 9L, 26L, 60L, 
90L, 80L, 10L, 9L, 26L, 60L, 90L, 80L)), .Names = c("id", "month", 
"type", "count"), class = "data.frame", row.names = c(NA, -18L
))

Psidom answered Oct 05 '22 13:10


This problem can be solved much more cleanly with facet_grid:

library(tidyverse)

read_tsv("tmp.tsv", col_types = "ccci") %>%
    ggplot(aes(x = month, y = count, fill = type)) +
    geom_col() +
    facet_grid(. ~ id)

(image: stacked bars side by side)

Note that you have to read the first three columns as character (the "ccc" in the col_types argument). Otherwise they are read as numbers, and ggplot treats month and type as continuous variables, which gives a gradient fill and a continuous x-axis. It would be even better to replace the numeric codes with something meaningful, e.g. turn the months into ordered factors "January", "February" instead of 1, 2, and do something similar for type and id.
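A minimal sketch of that relabelling, assuming the data is already in memory as a data frame rather than read from tmp.tsv. The label texts "type 1" and "id 1" and the name `labelled_df` are placeholders made up for illustration; only the month names come from the suggestion above.

```r
library(ggplot2)

# Same values as the data frame in the question (rebuilt here as an assumption)
df <- data.frame(
    id    = rep(1:3, each = 6),
    month = rep(rep(1:2, each = 3), times = 3),
    type  = rep(1:3, times = 6),
    count = rep(c(10L, 9L, 26L, 60L, 90L, 80L), times = 3)
)

# Turn the numeric codes into factors with readable labels.
# The level order controls the order on the axis and in the legend.
labelled_df <- transform(
    df,
    month = factor(month, levels = 1:2, labels = month.name[1:2]),
    type  = factor(type, levels = 1:3, labels = paste("type", 1:3)),  # placeholder labels
    id    = factor(id, levels = 1:3, labels = paste("id", 1:3))       # placeholder labels
)

p <- ggplot(labelled_df, aes(x = month, y = count, fill = type)) +
    geom_col() +
    facet_grid(. ~ id)
```

Since all three columns are factors here, the col_types workaround is not needed: ggplot treats them as discrete whether or not they started out as numbers.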

Michael Dunn answered Oct 05 '22 15:10