I have a survey file in which rows are observations and columns are questions.
Here is some fake data showing what it looks like:
    People,Food,Music,People
    P1,Very Bad,Bad,Good
    P2,Good,Good,Very Bad
    P3,Good,Bad,Good
    P4,Good,Very Bad,Very Good
    P5,Bad,Good,Very Good
    P6,Bad,Good,Very Good
My aim is to create this kind of plot with ggplot2.
Here are my fake data:
    raw <- read.csv("http://pastebin.com/raw.php?i=L8cEKcxS", sep = ",")
    raw[,2] <- factor(raw[,2], levels = c("Very Bad","Bad","Good","Very Good"), ordered = FALSE)
    raw[,3] <- factor(raw[,3], levels = c("Very Bad","Bad","Good","Very Good"), ordered = FALSE)
    raw[,4] <- factor(raw[,4], levels = c("Very Bad","Bad","Good","Very Good"), ordered = FALSE)
But if I choose count as Y, I run into the problem of choosing the X and Group values... I don't know if I can succeed without using reshape2... I've also tried to use reshape with the melt function, but I don't understand how to use it...
To create a grouped bar plot in R, call the geom_bar() function from the ggplot2 package with the required arguments. By default, geom_bar() makes the height of each bar proportional to the number of cases in each group.
To create a stacked bar chart (also known as a stacked bar graph or stacked bar plot), you can use barplot from base R graphics. You can add a title, a subtitle, and axis labels with the corresponding arguments, or remove the axes by setting axes = FALSE, among other customization arguments.
EDIT: Eight years later...
This needs a tidyverse solution, so here is one, with all non-base packages explicitly stated so that you know where each function comes from (except for read.csv, which is from utils, which comes with base R):
    library(magrittr) # needed for %>% if dplyr is not attached

    "http://pastebin.com/raw.php?i=L8cEKcxS" %>%
      read.csv(sep = ",") %>%
      tidyr::pivot_longer(cols = c(Food, Music, People.1),
                          names_to = "variable",
                          values_to = "value") %>%
      dplyr::group_by(variable, value) %>%
      dplyr::summarise(n = dplyr::n()) %>%
      dplyr::mutate(value = factor(
        value,
        levels = c("Very Bad", "Bad", "Good", "Very Good"))
      ) %>%
      ggplot2::ggplot(ggplot2::aes(variable, n)) +
      ggplot2::geom_bar(ggplot2::aes(fill = value),
                        position = "dodge",
                        stat = "identity")
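In case the pastebin link is not reachable, here is a rough sketch of the same idea that runs on the six sample rows from the question, inlined as text. The names csv_text, survey, rating_levels, long and counts are just placeholders I made up, and I'm assuming tidyr, dplyr and ggplot2 are installed. It uses dplyr::count() and geom_col() as shorthand for the group_by()/summarise() and geom_bar(stat = "identity") steps above.

```r
library(ggplot2)

# The six sample rows from the question, inlined so no download is needed.
csv_text <- "People,Food,Music,People
P1,Very Bad,Bad,Good
P2,Good,Good,Very Bad
P3,Good,Bad,Good
P4,Good,Very Bad,Very Good
P5,Bad,Good,Very Good
P6,Bad,Good,Very Good"

# read.csv() renames the duplicated "People" header to "People.1".
survey <- read.csv(text = csv_text)

rating_levels <- c("Very Bad", "Bad", "Good", "Very Good")

# Long format: one row per (respondent, question) pair.
long <- tidyr::pivot_longer(
  survey,
  cols = c(Food, Music, People.1),
  names_to = "variable",
  values_to = "value"
)
long$value <- factor(long$value, levels = rating_levels)

# count() is shorthand for group_by() followed by summarise(n = n()).
counts <- dplyr::count(long, variable, value)

# geom_col() draws bars whose heights are taken from the data (here n).
p <- ggplot(counts, aes(variable, n, fill = value)) +
  geom_col(position = "dodge")
print(p)
```

With only six respondents the bars are small, but the structure of counts (one row per question and rating) is the same thing the pipeline above feeds into ggplot.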
The original answer:
First you need to get the counts for each category, i.e. how many Bads and Goods and so on are there for each group (Food, Music, People). This would be done like so:
    raw <- read.csv("http://pastebin.com/raw.php?i=L8cEKcxS", sep = ",")
    raw[,2] <- factor(raw[,2], levels = c("Very Bad","Bad","Good","Very Good"), ordered = FALSE)
    raw[,3] <- factor(raw[,3], levels = c("Very Bad","Bad","Good","Very Good"), ordered = FALSE)
    raw[,4] <- factor(raw[,4], levels = c("Very Bad","Bad","Good","Very Good"), ordered = FALSE)

    raw = raw[,c(2,3,4)] # getting rid of the "people" variable as I see no use for it

    freq = table(col(raw), as.matrix(raw)) # get the counts of each factor level
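To see what the table(col(raw), as.matrix(raw)) line does, here is a small base R sketch on the six sample rows from the question (inlined as text, with csv_text as a made-up name), so the counts will not match the output for the full file.

```r
# The six sample rows from the question, inlined so no download is needed.
csv_text <- "People,Food,Music,People
P1,Very Bad,Bad,Good
P2,Good,Good,Very Bad
P3,Good,Bad,Good
P4,Good,Very Bad,Very Good
P5,Bad,Good,Very Good
P6,Bad,Good,Very Good"

# Keep only the three rating columns (Food, Music, People.1).
raw <- read.csv(text = csv_text)[, 2:4]

# col(raw) labels every cell with its column index (1, 2 or 3);
# as.matrix(raw) holds the rating text of every cell.
# table() cross-tabulates the two: one row per question,
# one column per rating.
freq <- table(col(raw), as.matrix(raw))
print(freq)
```

Note that as.matrix() turns the ratings into plain text, so table() orders the columns alphabetically (Bad, Good, Very Bad, Very Good) rather than by the factor levels. That is why the columns get reordered by hand in the next step.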
Then you need to create a data frame out of it, melt it and plot it:
    library(reshape2) # for melt()
    library(ggplot2)

    Names = c("Food","Music","People") # create vector of names
    data = data.frame(cbind(freq), Names) # combine them into a data frame
    data = data[,c(5,3,1,2,4)] # sort columns

    # melt the data frame for plotting
    data.m <- melt(data, id.vars = 'Names')

    # plot everything
    ggplot(data.m, aes(Names, value)) +
      geom_bar(aes(fill = variable), position = "dodge", stat = "identity")
Is this what you're after?
To clarify a little bit, in "ggplot multiple grouping bar" you had a data frame that looked like this:
    > head(df)
      ID Type Annee X1PCE X2PCE X3PCE X4PCE X5PCE X6PCE
    1  1    A  1980   450   338   154    36    13     9
    2  2    A  2000   288   407   212    54    16    23
    3  3    A  2020   196   434   246    68    19    36
    4  4    B  1980   111   326   441    90    21    11
    5  5    B  2000    63   298   443   133    42    21
    6  6    B  2020    36   257   462   162    55    30
Since you have numerical values in columns 4-9, which would later be plotted on the y axis, this can be easily transformed with reshape and plotted.
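As a rough sketch of that transformation, assuming the "reshape" step means melt() from the reshape2 package: the data frame below is retyped from the head(df) output above, and the choice of fill and facets in the plot is only my guess at one reasonable layout, not necessarily what the other question used.

```r
# Wide data frame retyped from the head(df) output above.
df <- data.frame(
  ID    = 1:6,
  Type  = rep(c("A", "B"), each = 3),
  Annee = rep(c(1980, 2000, 2020), times = 2),
  X1PCE = c(450, 288, 196, 111, 63, 36),
  X2PCE = c(338, 407, 434, 326, 298, 257),
  X3PCE = c(154, 212, 246, 441, 443, 462),
  X4PCE = c(36, 54, 68, 90, 133, 162),
  X5PCE = c(13, 16, 19, 21, 42, 55),
  X6PCE = c(9, 23, 36, 11, 21, 30)
)

# melt() keeps the id columns and stacks every other column into
# two new columns: "variable" (old column name) and "value" (the number).
long <- reshape2::melt(df, id.vars = c("ID", "Type", "Annee"))
head(long)

# One possible plot (an assumption): bars per XnPCE column,
# dodged by year, with one panel per Type.
p <- ggplot2::ggplot(long, ggplot2::aes(variable, value, fill = factor(Annee))) +
  ggplot2::geom_col(position = "dodge") +
  ggplot2::facet_wrap(~ Type)
print(p)
```

The point is that the six numeric columns collapse into one value column that can go straight on the y axis, with variable available for x or fill.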
For our current data set, we needed something similar, so we used freq=table(col(raw), as.matrix(raw)) to get this:
    > data
       Names Very.Bad Bad Good Very.Good
    1   Food        7   6    5         2
    2  Music        5   5    7         3
    3 People        6   3    7         4
Just imagine you have Very.Bad, Bad, Good and so on instead of X1PCE, X2PCE, X3PCE. See the similarity? But we needed to create such structure first. Hence the freq=table(col(raw), as.matrix(raw)).