How to plot a (sophisticated) stacked barplot in ggplot2, without complicated manual data aggregation

Tags: r, ggplot2

I want to plot a (faceted) stacked bar plot where the y-axis is in percent and the frequency labels are displayed within the bars.

After quite some work and reading many questions on Stack Overflow, I found a way to do this with ggplot2. However, I don't let ggplot2 do the work directly: I aggregate my data manually with a table call, and I compute the percent values by hand with temporary variables (see the source code comment "manually aggregate data").

How can I produce the same plot in a cleaner way, without the complicated manual data aggregation?

library(ggplot2)
library(scales)

library(gridExtra)
library(plyr)

##
##  Random Data
##
fact1 <- factor(floor(runif(1000, 1,6)),
                      labels = c("A","B", "C", "D", "E"))

fact2 <- factor(floor(runif(1000, 1,6)),
                      labels = c("g1","g2", "g3", "g4", "g5"))

##
##  STACKED BAR PLOT that scales y-axis to 100%
##

## manually aggregate data
##
mytable <- as.data.frame(table(fact1, fact2))

colnames(mytable) <- c("caseStudyID", "Group", "Freq")

mytable$total <- sapply(mytable$caseStudyID,
                        function(caseID) sum(subset(mytable, caseStudyID == caseID)$Freq))

mytable$percent <- round((mytable$Freq/mytable$total)*100,2)

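## ggplot2 >= 2.2.0 stacks groups in reverse order (first level on top),
## so the label midpoints are accumulated from the last group upwards
mytable2 <- ddply(mytable, .(caseStudyID), transform, pos = rev(cumsum(rev(percent))) - 0.5*percent)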


## all case studies in one plot (SCALED TO 100%)

p1 <- ggplot(mytable2, aes(x=caseStudyID, y=percent, fill=Group)) +
    geom_bar(stat="identity") +
    theme(legend.key.size = unit(0.4, "cm")) +
    theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
    geom_text(aes(label = sapply(Freq, function(x) ifelse(x>0, x, NA)), y = pos), size = 3) # the ifelse guards against printing labels with "0" within a bar


print(p1)

mrsteve asked Dec 02 '25 17:12


1 Answer

After you make the data:

fact1 <- factor(floor(runif(1000, 1,6)),
                  labels = c("A","B", "C", "D", "E"))

fact2 <- factor(floor(runif(1000, 1,6)),
                  labels = c("g1","g2", "g3", "g4", "g5"))

dat = data.frame(caseStudyID=fact1, Group=fact2)

You can automate making an unlabeled graph of the kind that you want with position_fill:

ggplot(dat, aes(caseStudyID, fill=Group)) + geom_bar(position="fill")

(Figure: unlabeled graph)
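Since the question asks for an axis in percent, here is a minimal sketch of how the filled bars could get percent tick labels. It assumes the scales package is installed and uses its percent() labeller; the seed and the name p_pct are only there for illustration and are not part of the original code.

```r
library(ggplot2)
library(scales)  # provides percent(), used as a label formatter

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
fact1 <- factor(floor(runif(1000, 1, 6)),
                labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)),
                labels = c("g1", "g2", "g3", "g4", "g5"))
dat <- data.frame(caseStudyID = fact1, Group = fact2)

# position = "fill" rescales every bar to the range 0..1;
# the percent labeller only changes how the ticks are printed
p_pct <- ggplot(dat, aes(caseStudyID, fill = Group)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = percent) +
  ylab("percent")

print(p_pct)
```

The underlying values stay fractions between 0 and 1, so any label positions computed later are still on that scale.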

I don't know if there's a way to generate the text labels automatically. The positions and counts from the stacked graph are accessible with ggplot_build, if you want to use what ggplot calculates instead of doing it separately.

p = ggplot(dat, aes(caseStudyID, fill=Group)) + geom_bar(position="fill")
ggplot_build(p)$data[[1]]

That returns a data frame with, among other things, count, x, y, ymin, and ymax columns that can be used to create positioned labels.
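To get a feel for that data frame, the following sketch looks at the relevant columns and checks that the counts ggplot computed add up to the number of rows in the data. The seed and the helper column caseStudyID added to the built data are assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
fact1 <- factor(floor(runif(1000, 1, 6)),
                labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)),
                labels = c("g1", "g2", "g3", "g4", "g5"))
dat <- data.frame(caseStudyID = fact1, Group = fact2)

p <- ggplot(dat, aes(caseStudyID, fill = Group)) + geom_bar(position = "fill")

# One row per bar segment, i.e. per non-empty caseStudyID/Group combination
built <- ggplot_build(p)$data[[1]]
print(head(built[, c("x", "fill", "count", "ymin", "ymax")]))

# x holds the integer position on the discrete axis; map it back to the level
built$caseStudyID <- levels(dat$caseStudyID)[as.integer(built$x)]

# Sanity check: ggplot's counts cover every observation exactly once
stopifnot(sum(built$count) == nrow(dat))
```

With position="fill", ymin and ymax are fractions of the bar height, while count keeps the raw frequency, which is what the labels should show.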

If you want the labels vertically centered in each category, first make a column with values halfway between ymin and ymax.

freq = ggplot_build(p)$data[[1]]
freq$y_pos = (freq$ymin + freq$ymax) / 2

Then add the labels to the graph with annotate.

p + annotate(x=freq$x, y=freq$y_pos, label=freq$count, geom="text", size=3)

(Figure: labeled graph)
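As a possible alternative to the ggplot_build step, newer ggplot2 versions can probably place the labels directly. The sketch below assumes ggplot2 3.3.0 or later (for after_stat()) and relies on the vjust argument of position_fill() to center each label in its segment; the seed and the name p_auto are illustrative only.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
fact1 <- factor(floor(runif(1000, 1, 6)),
                labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)),
                labels = c("g1", "g2", "g3", "g4", "g5"))
dat <- data.frame(caseStudyID = fact1, Group = fact2)

p_auto <- ggplot(dat, aes(caseStudyID, fill = Group)) +
  geom_bar(position = "fill") +
  # Count the rows per segment with the same stat geom_bar uses,
  # then put the count at the vertical middle of each filled segment
  geom_text(aes(label = after_stat(count)),
            stat = "count",
            position = position_fill(vjust = 0.5),
            size = 3)

print(p_auto)
```

Because the counts come from stat="count", combinations that never occur produce no row at all, so no "0" labels need to be filtered out.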

user2034412 answered Dec 05 '25 07:12


