Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Showing data values on stacked bar chart in ggplot2

People also ask

How do you make a stacked bar chart in R?

In order to create a stacked bar chart, also known as stacked bar graph or stacked bar plot, you can use barplot from base R graphics. Note that you can add a title, a subtitle, the axes labels with the corresponding arguments or remove the axes setting axes = FALSE , among other customization arguments.

How do I add a data label to a Barplot in R?

To add labels on top of each bar in Barplot in R we use the geom_text() function of the ggplot2 package. Parameters: value: value field of which labels have to display. nudge_y: distance shift in the vertical direction for the label.


From ggplot 2.2.0 labels can easily be stacked by using position = position_stack(vjust = 0.5) in geom_text.

ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) +
  geom_bar(stat = "identity") +
  geom_text(size = 3, position = position_stack(vjust = 0.5))

enter image description here

Also note that "position_stack() and position_fill() now stack values in the reverse order of the grouping, which makes the default stack order match the legend."


Answer valid for older versions of ggplot:

Here is one approach, which calculates the midpoints of the bars.

library(ggplot2)
library(plyr)

# calculate midpoints of bars (simplified using comment by @DWin)
Data <- ddply(Data, .(Year), 
   transform, pos = cumsum(Frequency) - (0.5 * Frequency)
)

# library(dplyr) ## If using dplyr... 
# Data <- group_by(Data,Year) %>%
#    mutate(pos = cumsum(Frequency) - (0.5 * Frequency))

# plot bars and add text
p <- ggplot(Data, aes(x = Year, y = Frequency)) +
     geom_bar(aes(fill = Category), stat="identity") +
     geom_text(aes(label = Frequency, y = pos), size = 3)

Resultant chart


As hadley mentioned there are more effective ways of communicating your message than labels in stacked bar charts. In fact, stacked charts aren't very effective as the bars (each Category) doesn't share an axis so comparison is hard.

It's almost always better to use two graphs in these instances, sharing a common axis. In your example I'm assuming that you want to show overall total and then the proportions each Category contributed in a given year.

library(grid)
library(gridExtra)
library(plyr)

# create a new column with proportions
prop <- function(x) x/sum(x)
Data <- ddply(Data,"Year",transform,Share=prop(Frequency))

# create the component graphics
totals <- ggplot(Data,aes(Year,Frequency)) + geom_bar(fill="darkseagreen",stat="identity") + 
  xlab("") + labs(title = "Frequency totals in given Year")
proportion <- ggplot(Data, aes(x=Year,y=Share, group=Category, colour=Category)) 
+ geom_line() + scale_y_continuous(label=percent_format())+ theme(legend.position = "bottom") + 
  labs(title = "Proportion of total Frequency accounted by each Category in given Year")

# bring them together
grid.arrange(totals,proportion)

This will give you a 2 panel display like this:

Vertically stacked 2 panel graphic

If you want to add Frequency values a table is the best format.