
Make Y-axis start at 1 instead of 0 within ggplot bar chart

Tags: r, ggplot2

I have some survey data where people rated how strongly they agree or disagree with different statements. Their responses can be any value (decimals included) between 1 and 4 (1 = strongly disagree, 2 = disagree, etc.).

I want to summarize this data by plotting the mean for each variable in a bar chart. I also want the Y-axis labels to show the anchor-point labels (1 = strongly disagree, 2 = disagree, etc.) instead of numeric values.

Given the data included below, I can accomplish this with the following code:

ggplot(data = data, aes(x=factor(key), y=value, fill=key)) + 
  stat_summary(fun.y="mean", geom="bar", width = 0.5) +
  stat_summary(aes(label=round(..y..,1)), fun.y="mean", geom="text", vjust = -0.5) +
  geom_hline(yintercept = 3, linetype="solid", color = "red", size=1.5, alpha=0.25) +
  scale_y_discrete(limits=c("Strongly Disagree", "Disagree", "Agree", "Strongly Agree"))

This is close to what I need, but I would really like to make the Y-axis start at 1 = Strongly Disagree instead of 0.

I was thinking that I could just subtract 1 from all of the numeric responses, but then my average score labels for each bar would be incorrect.

The only constraint that I have is that I would like to accomplish this task within ggplot, and hopefully not by reshaping the original data. I have another chart like this where I used facet_wrap() to create the same chart for each group (variable not included) within my dataset.

I've done a lot of searching, but it seems that changing the starting point of the axis in ggplot is not typically advised. However, given this situation, I think it sounds acceptable.


data <- structure(list(key = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L), .Label = c("Clarity", "Appropriateness", "Commitment"
), class = "factor"), value = c(NA, 3.33333333333333, 3.33333333333333, 
4, 4, 3, 4, NA, 3, NA, 3, 4, NA, NaN, 3, 2.66666666666667, 3, 
NA, 3.33333333333333, 3.66666666666667, 3.66666666666667, 4, 
NA, 3, 4, 3.66666666666667, 3, 2.66666666666667, 3, 4, 4, 3, 
3, NaN, 3, 4, 3, 4, 3, 4, 4, 2.33333333333333, 3, 4, 4, 3, 4, 
3, 3, 3.33333333333333, 3, 4, 3, NA, 2.66666666666667, 3.33333333333333, 
4, 2.33333333333333, 3.66666666666667, 4, 4, 3, NA, 3, 4, 3.2, 
4, 3, 4, NA, 3.2, NA, 3, 4, NA, 4, 3, 3.4, 3, NA, 2.8, 3.6, 3.6, 
3.8, NA, 3, 3.4, 3.2, 3, 3, 3.4, 3.8, 3.6, 3, 3, NaN, 2.4, 4, 
3, 3.2, 3.2, 4, 4, 2.6, 3.8, 4, 4, 3.6, 3.2, 3, 3, 4, 2.8, 4, 
3, NA, 3.4, 3.4, 4, 2.6, 3.8, 4, 3.4, 3, NA, 2.33333333333333, 
4, 3.66666666666667, 4, 3, 4, NA, 3.33333333333333, NA, 4, 4, 
NA, 4, 4, 2.33333333333333, 3.66666666666667, NA, 3, 4, 4, 4, 
NA, 3.33333333333333, 3, 4, 3.33333333333333, 3.66666666666667, 
3.33333333333333, 4, 4, 2.33333333333333, 3.66666666666667, NaN, 
3, 4, 3, 3, 4, 3.66666666666667, 4, 3.33333333333333, 4, 3.66666666666667, 
4, 4, 4, 3.66666666666667, 3, 3.33333333333333, 3.66666666666667, 
3.66666666666667, 2.66666666666667, NA, 2.33333333333333, 3, 
4, 3, 3.66666666666667, 4, 4, 4)), class = "data.frame", row.names = c(NA, 
-186L))
CurtLH asked Dec 19 '18 01:12



1 Answer

coord_cartesian() gets the job done: it zooms the plot to the limited area while still retaining all of the data.

If you instead set limits = c(1, 4) in scale_y_continuous(), the plot breaks: every bar starts at 0, which lies outside those limits, so the bars are dropped.
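
A minimal sketch of the difference between the two approaches, using a small made-up data frame (`toy` and its values are hypothetical, not the survey data from the question). It inspects the lower edge of each bar after ggplot2 has built the layer:

```r
library(ggplot2)

# Hypothetical toy data, not the survey data from the question
toy <- data.frame(key = c("a", "b", "c"), value = c(2.5, 3.2, 3.8))

base <- ggplot(toy, aes(x = key, y = value)) + geom_col()

# Zooming with the coordinate system leaves the bar geometry untouched
zoomed <- base + coord_cartesian(ylim = c(1, 4))

# Setting limits on the scale censors out-of-range values: each bar's
# lower edge (ymin = 0) lies outside [1, 4] and is replaced by NA
limited <- base + scale_y_continuous(limits = c(1, 4))

zoomed_ymin  <- layer_data(zoomed)$ymin   # lower edges are still 0
limited_ymin <- layer_data(limited)$ymin  # lower edges are now NA

print(zoomed_ymin)
print(limited_ymin)
```

Printing `limited` should also produce a warning about removed rows, which is the visible symptom of the censoring.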

Code

library(ggplot2)

ggplot(data = data, aes(x = key, y = value, fill = key)) +
    # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
    stat_summary(fun = "mean", geom = "bar", width = 0.5) +
    # after_stat(y) replaces the older ..y.. notation
    stat_summary(aes(label = round(after_stat(y), 1)),
                 fun = "mean", geom = "text", vjust = -0.5) +
    # linewidth replaces size for lines (ggplot2 >= 3.4.0)
    geom_hline(yintercept = 3, linetype = "solid",
               color = "red", linewidth = 1.5, alpha = 0.25) +
    # limit the vertical space to 1 to 4, but keep the data
    coord_cartesian(ylim = c(1, 4)) +
    # set ticks at 1, 2, 3, 4 and label them with names
    scale_y_continuous(breaks = 1:4,
                       labels = c("Strongly Disagree", "Disagree",
                                  "Agree", "Strongly Agree"))
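
The question also mentions a faceted version of the chart. As a rough sketch, the same approach should carry over unchanged when facet_wrap() is added. The `group` column and the random values below are assumptions made up for illustration, since the grouping variable is not included in the posted data:

```r
library(ggplot2)

# Hypothetical stand-in for the survey data, with an assumed `group` column
set.seed(1)
toy <- data.frame(
  group = rep(c("Group A", "Group B"), each = 30),
  key   = rep(c("Clarity", "Appropriateness", "Commitment"), times = 20),
  value = round(runif(60, min = 2, max = 4), 1)
)

p <- ggplot(toy, aes(x = key, y = value, fill = key)) +
  stat_summary(fun = "mean", geom = "bar", width = 0.5) +
  stat_summary(aes(label = round(after_stat(y), 1)),
               fun = "mean", geom = "text", vjust = -0.5) +
  # zoom every panel to 1..4 without dropping any data
  coord_cartesian(ylim = c(1, 4)) +
  scale_y_continuous(breaks = 1:4,
                     labels = c("Strongly Disagree", "Disagree",
                                "Agree", "Strongly Agree")) +
  # one panel per (assumed) group
  facet_wrap(~ group)

bars <- ggplot_build(p)$data[[1]]  # one row per bar, across both panels
print(p)
```

Because the zoom happens in the coordinate system, the bars in every panel keep their true heights and the mean labels stay correct.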
Roman answered Sep 27 '22 01:09