How can I create a (100%) stacked histogram in R?

My dataset:

I have data in the following format (here, imported from a CSV file). You can find an example dataset as CSV here.

PAIR   PREFERENCE
1      5
1      3
1      2
2      4
2      1
2      3

… and so on. In total, there are 19 pairs, and the PREFERENCE ranges from 1 to 5, as discrete values.
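For anyone who wants to reproduce this without the CSV, here is a minimal sketch that builds a data frame from only the six sample rows shown above. The name `d` matches the code further down; the file name in the comment is hypothetical, since the actual CSV name is not given.

```r
# Only the six sample rows from the table above; the real data has 19 pairs.
# With the actual file you would use something like
#   d <- read.csv("preferences.csv")   # hypothetical file name
d <- read.table(header = TRUE, text = "
PAIR PREFERENCE
1 5
1 3
1 2
2 4
2 1
2 3
")

# Both columns are read as integers
str(d)
```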

What I'm trying to achieve:

What I need is a stacked histogram, i.e. a column of 100% height for each pair, showing the distribution of the PREFERENCE values.

Something similar to the "100% stacked columns" in Excel, or (although not quite the same) a so-called "mosaic plot".

What I tried:

I figured it'd be easiest using ggplot2, but I don't even know where to start. I know I can create a simple bar chart with something like:

ggplot(d, aes(x=factor(PAIR), y=factor(PREFERENCE))) + geom_bar(position="fill")

… that however doesn't get me very far. So I tried this, and it gets me somewhat closer to what I'm trying to achieve, but it still uses the count of PREFERENCE, I suppose? Note the ylab being "count" here, and the values ranging to 19.

qplot(factor(PAIR), data=d, geom="bar", fill=factor(PREFERENCE_FIXED))

Results in:

[Image: the resulting stacked bar chart, with "count" on the y axis]

So, what do I have to do to get the stacked bars to represent a histogram?
Or do they actually do this already?
If so, what do I have to change to get the labels right (e.g. have percentages instead of the "count")?

By the way, this is not really related to this question, and only marginally related to this (i.e. probably the same idea, but not continuous values, instead grouped into bars).

asked Jan 06 '12 12:01

slhck

1 Answer

Maybe you want something like this:

ggplot() + 
    geom_bar(data = dat,
             aes(x = factor(PAIR),fill = factor(PREFERENCE)),
             position = "fill")

where I've read your data into dat. This outputs something like this:

[Image: the resulting 100% stacked bar chart]

The y label is still "count", but you can change that manually by adding:

+ scale_x_discrete("Pairs") + scale_y_continuous("Votes")
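Putting the pieces together, a minimal sketch that also shows percentages on the y axis might look like the following. It assumes the `scales` package is available (it is installed as a ggplot2 dependency) and uses only the six sample rows from the question; the axis and legend titles are illustrative choices, not something the original answer specifies.

```r
library(ggplot2)

# Only the six sample rows from the question; the real data has 19 pairs
dat <- data.frame(
  PAIR       = c(1, 1, 1, 2, 2, 2),
  PREFERENCE = c(5, 3, 2, 4, 1, 3)
)

# Share of each PREFERENCE value within each PAIR (every row sums to 1).
# These are the proportions that position = "fill" draws.
shares <- prop.table(table(dat$PAIR, dat$PREFERENCE), margin = 1)

p <- ggplot(dat, aes(x = factor(PAIR), fill = factor(PREFERENCE))) +
  geom_bar(position = "fill") +
  scale_x_discrete("Pairs") +
  # scales::percent formats 0.25 as "25%"
  scale_y_continuous("Share of votes", labels = scales::percent) +
  labs(fill = "Preference")

print(p)
```

With `position = "fill"` each bar is rescaled to a height of 1, so the stacked segments are already the per-pair proportions held in `shares`; only the axis labels need to change.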

answered Oct 30 '22 21:10

joran

Tags: plot, r, ggplot2, histogram