
How to create a stacked bar chart from summarized data in ggplot2

I'm trying to create a stacked bar chart using ggplot2. My data, in its wide form, looks like this. The number in each cell is the frequency of responses.

activity                         yes    no  dontknow
Social events                     27    3   3
Academic skills workshops         23    5   8
Summer research                   22    7   7
Research fellowship               20    6   9
Travel grants                     18    8   7
Resume preparation                17    4   12
RAs                               14    11  8
Faculty preparation               13    8   11
Job interview skills              11    9   12
Preparation of manuscripts        10    8   14
Courses in other campuses          5    11  15
Teaching fellowships               4    14  16
TAs                                3    15  15
Access to labs in other campuses   3    11  18
Interdisciplinary research         2    11  18
Interdepartamental projects        1    12  19

I melted this table using reshape2:

 melted.data <- melt(wide.data, id.vars=c("activity"), measure.vars=c("yes","no","dontknow"), variable.name="haveused", value.name="responses")

That's as far as I can get. I want to create a stacked bar chart with activities on the x axis, frequency of responses on the y axis, and each bar showing the distribution of the yes, no and dontknow responses.

I've tried

ggplot(melted.data,aes(x=activity,y=responses))+geom_bar(aes(fill=haveused))

but I'm afraid that's not the right solution.

Any help is much appreciated.

Bartolome Salom asked Aug 11 '12 22:08


1 Answer

You haven't said what is not right about your solution. Some issues that could be construed as problems, with one possible solution for each, are:

  • The x axis tick mark labels run into each other. SOLUTION - rotate the tick mark labels;
  • The order in which the labels (and their corresponding bars) appear is not the same as the order in the original data frame. SOLUTION - reorder the levels of the factor 'activity';
  • The bar segments carry no value labels. SOLUTION - add the text with geom_text, and set the vjust parameter in position_stack to 0.5 to position it inside the bars.

The following might be a start.

# Load required packages
library(ggplot2)
library(reshape2)

# Read in data
df = read.table(text = "
activity                         yes    no  dontknow
Social.events                     27    3   3
Academic.skills.workshops         23    5   8
Summer.research                   22    7   7
Research.fellowship               20    6   9
Travel.grants                     18    8   7
Resume.preparation                17    4   12
RAs                               14    11  8
Faculty.preparation               13    8   11
Job.interview.skills              11    9   12
Preparation.of.manuscripts        10    8   14
Courses.in.other.campuses          5    11  15
Teaching.fellowships               4    14  16
TAs                                3    15  15
Access.to.labs.in.other.campuses   3    11  18
Interdisciplinary.research         2    11  18
Interdepartamental.projects        1    12  19", header = TRUE, sep = "")

# Melt the data frame from wide to long format
dfm = melt(df, id.vars = c("activity"), measure.vars = c("yes", "no", "dontknow"),
    variable.name = "haveused", value.name = "responses")

# Reorder the levels of activity to match the row order of the original data frame
dfm$activity = factor(dfm$activity, levels = df$activity)

# Draw the plot
ggplot(dfm, aes(x = activity, y = responses, group = haveused)) +
    geom_col(aes(fill = haveused)) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.25)) +
    geom_text(aes(label = responses), position = position_stack(vjust = .5), size = 3)  # labels inside the bar segments
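
A side note on why geom_col is used above rather than a plain geom_bar: geom_bar counts rows by default (stat = "count"), whereas the data here are already summarized frequencies. The sketch below is only an illustration on a hypothetical two-activity subset (the names dfm_small, p_col and p_bar are made up for this example). It suggests that geom_col() and geom_bar(stat = "identity") should build the same stacked bars.

```r
library(ggplot2)

# Hypothetical long-format subset of the data above (two activities only)
dfm_small <- data.frame(
  activity  = rep(c("Social.events", "RAs"), times = 3),
  haveused  = rep(c("yes", "no", "dontknow"), each = 2),
  responses = c(27, 14, 3, 11, 3, 8)
)

# geom_col() uses the y values directly as bar heights
p_col <- ggplot(dfm_small, aes(x = activity, y = responses, fill = haveused)) +
  geom_col()

# geom_bar() counts rows by default; stat = "identity" tells it to use y as given
p_bar <- ggplot(dfm_small, aes(x = activity, y = responses, fill = haveused)) +
  geom_bar(stat = "identity")

# Compare the stacked segment tops computed for each plot
all.equal(layer_data(p_col)$ymax, layer_data(p_bar)$ymax)
```

Either form should work for the full data frame; geom_col is just the shorter spelling.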
Sandy Muspratt answered Sep 25 '22 17:09
