I have some survey data where people rated how much they agree with different statements, from strongly disagree to strongly agree. Their responses can be any value (decimals included) between 1 and 4 (1 = strongly disagree, 2 = disagree, etc.).
I want to summarize this data by plotting the mean for each variable in a bar chart. I also want the Y axis to show, instead of numeric values, the labels at the anchor points: 1 = strongly disagree, 2 = disagree, etc.
Given the data included below, I can accomplish this with the following code:
ggplot(data = data, aes(x=factor(key), y=value, fill=key)) +
stat_summary(fun.y="mean", geom="bar", width = 0.5) +
stat_summary(aes(label=round(..y..,1)), fun.y="mean", geom="text", vjust = -0.5) +
geom_hline(yintercept = 3, linetype="solid", color = "red", size=1.5, alpha=0.25) +
scale_y_discrete(limits=c("Strongly Disagree", "Disagree", "Agree", "Strongly Agree"))
This is close to what I need, but I would really like to make the Y-axis start at 1 = Strongly Disagree instead of 0.
I was thinking that I could just subtract 1 from all of the numeric responses, but then my average score labels for each bar would be incorrect.
The only constraint that I have is that I would like to accomplish this task within ggplot, and hopefully not by reshaping the original data. I have another chart like this where I used facet_wrap() to create the same chart for each group (variable not included) within my dataset.
I've done much searching, but it seems changing the starting point of the axis in ggplot is not something that is typically advised. However, given this situation, I think it sounds acceptable.
data <- structure(list(key = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("Clarity", "Appropriateness", "Commitment"
), class = "factor"), value = c(NA, 3.33333333333333, 3.33333333333333,
4, 4, 3, 4, NA, 3, NA, 3, 4, NA, NaN, 3, 2.66666666666667, 3,
NA, 3.33333333333333, 3.66666666666667, 3.66666666666667, 4,
NA, 3, 4, 3.66666666666667, 3, 2.66666666666667, 3, 4, 4, 3,
3, NaN, 3, 4, 3, 4, 3, 4, 4, 2.33333333333333, 3, 4, 4, 3, 4,
3, 3, 3.33333333333333, 3, 4, 3, NA, 2.66666666666667, 3.33333333333333,
4, 2.33333333333333, 3.66666666666667, 4, 4, 3, NA, 3, 4, 3.2,
4, 3, 4, NA, 3.2, NA, 3, 4, NA, 4, 3, 3.4, 3, NA, 2.8, 3.6, 3.6,
3.8, NA, 3, 3.4, 3.2, 3, 3, 3.4, 3.8, 3.6, 3, 3, NaN, 2.4, 4,
3, 3.2, 3.2, 4, 4, 2.6, 3.8, 4, 4, 3.6, 3.2, 3, 3, 4, 2.8, 4,
3, NA, 3.4, 3.4, 4, 2.6, 3.8, 4, 3.4, 3, NA, 2.33333333333333,
4, 3.66666666666667, 4, 3, 4, NA, 3.33333333333333, NA, 4, 4,
NA, 4, 4, 2.33333333333333, 3.66666666666667, NA, 3, 4, 4, 4,
NA, 3.33333333333333, 3, 4, 3.33333333333333, 3.66666666666667,
3.33333333333333, 4, 4, 2.33333333333333, 3.66666666666667, NaN,
3, 4, 3, 3, 4, 3.66666666666667, 4, 3.33333333333333, 4, 3.66666666666667,
4, 4, 4, 3.66666666666667, 3, 3.33333333333333, 3.66666666666667,
3.66666666666667, 2.66666666666667, NA, 2.33333333333333, 3,
4, 3, 3.66666666666667, 4, 4, 4)), class = "data.frame", row.names = c(NA,
-186L))
A common guideline for bar charts is that the y-axis should always start at zero. The bars in a bar chart encode the data by their length, so if we truncate the length by starting the axis at something other than zero, we distort the visual comparison between bars.
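coord_cartesian() gets the job done by zooming the plot to the limited area while still retaining the data. If you instead set limits = in scale_y_continuous(), your plot would break: the scale discards values outside the limits, and because every bar starts at 0, the bars would be dropped.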
# uses fun, after_stat() and linewidth (ggplot2 >= 3.4.0);
# older versions use fun.y, ..y.. and size instead
library(ggplot2)
ggplot(data = data, aes(x = key, y = value, fill = key)) +
  stat_summary(fun = "mean", geom = "bar", width = 0.5) +
  stat_summary(aes(label = round(after_stat(y), 1)),
               fun = "mean", geom = "text", vjust = -0.5) +
  geom_hline(yintercept = 3, linetype = "solid",
             color = "red", linewidth = 1.5, alpha = 0.25) +
  # limit the vertical space to 1 to 4, but keep the data
  coord_cartesian(ylim = c(1, 4)) +
  # set ticks at 1, 2, 3, 4
  scale_y_continuous(breaks = 1:4,
                     # label them with names
                     labels = c("Strongly Disagree", "Disagree",
                                "Agree", "Strongly Agree"))
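To see the difference between zooming with coord_cartesian() and setting scale limits, here is a minimal sketch on a small made-up data frame (toy and its values are hypothetical, not the survey data from the question). It inspects the computed bar data with layer_data() rather than drawing the plots, and assumes a ggplot2 version that supports the fun argument (3.3.0 or later).

```r
library(ggplot2)

# Hypothetical toy data: two items with two responses each
toy <- data.frame(key = c("A", "A", "B", "B"),
                  value = c(3, 4, 2, 3))

base <- ggplot(toy, aes(x = key, y = value)) +
  stat_summary(fun = mean, geom = "bar")

# Zooming: bars are still computed from 0, only the visible window changes
zoomed <- base + coord_cartesian(ylim = c(1, 4))

# Scale limits: anything outside 1..4 is censored, including the bar base at 0
clipped <- base + scale_y_continuous(limits = c(1, 4))

zoomed_data  <- layer_data(zoomed)
clipped_data <- layer_data(clipped)

# Compare the bar tops (y) and bar bases (ymin) in both versions
print(zoomed_data[, c("y", "ymin")])
print(clipped_data[, c("y", "ymin")])
```

In the zoomed version the bar base stays at 0 and is simply cropped from view, while in the version with scale limits the base becomes NA, which is why ggplot2 drops the bars when drawing. The mean labels are unaffected in either case because they come from the untouched value column.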