I want to make a bar plot where one of the values is much bigger than all other values. Is there a way of having a discontinuous y-axis? My data is as follows:
df <- data.frame(a = c(1,2,3,500), b = c('a1', 'a2','a3', 'a4'))
p <- ggplot(data = df, aes(x = b, y = a)) + geom_bar()
p <- p + opts(axis.text.x=theme_text(angle= 90, hjust=1)) + coord_flip()
p
Is there a way that I can make my axis run from 1 - 10, then 490 - 500? I can't think of any other way of plotting the data (aside from transforming it, which I don't want to do).
Eight years later, the code above needs to be amended to work with version 3.1.1 of ggplot2 in order to create the same chart:
library(ggplot2)
ggplot(df) +
  aes(x = b, y = a) +
  geom_col() +
  coord_flip()
To add axis breaks to ggplot2 plots in R, we can use the scale_x_break() and scale_y_break() functions from the ggbreak package. Each call takes a numeric vector of two values giving the start and end of the range to omit. If we need multiple breaks, we can add further calls.
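A minimal sketch of this approach for the data in the question, assuming the ggbreak package (which provides scale_y_break(); it is not part of ggplot2 itself) is installed. The choice of 10 and 490 as the ends of the gap is taken from the question and is only one possible setting:

```r
library(ggplot2)
library(ggbreak)  # assumption: installed separately from ggplot2

df <- data.frame(a = c(1, 2, 3, 500), b = c("a1", "a2", "a3", "a4"))

# Omit the range between 10 and 490 on the y-axis
p <- ggplot(df, aes(x = b, y = a)) +
  geom_col() +
  scale_y_break(c(10, 490))
p
```

The bar for a4 is drawn across the gap, so its length no longer reflects its value relative to the others.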
What is it? An axis break is a disruption in the continuity of values on either the y or x axis on a chart. It is also known as a scale break or graph break and is shown on the chart as a wavy line or diagonal line on the axis and on the bars plotted on that axis.
Displays a plot omitting one or two ranges of values on one axis. Typically used when there is a relatively large gap or two in the overall range of one set of values, often because of outliers. The function warns the user if any values may have been omitted by being in the "gap".
As noted elsewhere, this isn't something that ggplot2 will handle well, since broken axes are generally considered questionable.
Other strategies are often considered better solutions to this problem. Brian mentioned a few (faceting, two plots focusing on different sets of values). One other option that people too often overlook, particularly for barcharts, is to make a table:
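A minimal sketch of such a table, assuming the data frame from the question; simply printing it is enough here:

```r
df <- data.frame(a = c(1, 2, 3, 500), b = c("a1", "a2", "a3", "a4"))

# Show category and value side by side, without row names
print(df[, c("b", "a")], row.names = FALSE)
```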
Looking at the actual values, the 500 doesn't obscure the differences in the other values! For some reason tables don't get enough respect as a data visualization technique. You might object that your data has many, many categories, which becomes unwieldy in a table. If so, it's likely that your bar chart will have too many bars to be sensible as well.
And I'm not arguing for tables all the time. But they are definitely something to consider if you are making barcharts with relatively few bars. And if you're making barcharts with tons of bars, you might need to rethink that anyway.
Finally, there is also the axis.break function in the plotrix package which implements broken axes. However, from what I gather you'll have to specify the axis labels and positions yourself, by hand.
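A rough sketch of what that manual work might look like with base graphics, assuming plotrix is installed. The shift of 478 and the tick positions are arbitrary choices made for this illustration, not something plotrix prescribes:

```r
library(plotrix)

df <- data.frame(a = c(1, 2, 3, 500), b = c("a1", "a2", "a3", "a4"))

# Squeeze the outlier by hand: values above 10 are shifted down by 478,
# so that 490..500 is drawn at 12..22 (hypothetical choice of shift)
shift <- 478
squeezed <- ifelse(df$a > 10, df$a - shift, df$a)

# Draw the bars without the default value axis
barplot(squeezed, names.arg = df$b, axes = FALSE, ylim = c(0, 22))

# Label the axis by hand so the ticks show the original values
axis(2, at = c(0, 5, 10, 12, 17, 22), labels = c(0, 5, 10, 490, 495, 500))

# Mark the break on the y-axis between the two ranges
axis.break(axis = 2, breakpos = 11, style = "slash")
```

Note that this only marks the axis; the a4 bar itself is drawn through the gap unless it is masked separately.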
Eight years later, the ggforce package offers a facet_zoom() extension which is an implementation of Hadley Wickham's suggestion to show two plots (as referenced in Brian Diggs' answer).
library(ggforce)
ggplot(df) +
  aes(x = b, y = a) +
  geom_col() +
  facet_zoom(ylim = c(0, 10))
Unfortunately, the current version 0.2.2 of ggforce throws an error with coord_flip(), so only vertical bars can be shown.
The zoomed facet shows the variations of the small values but still contains the large (now cropped) a4 bar. The zoom.data parameter controls which values appear in the zoomed facet:
library(ggforce)
ggplot(df) +
  aes(x = b, y = a) +
  geom_col() +
  facet_zoom(ylim = c(0, 10), zoom.data = ifelse(a <= 10, NA, FALSE))
Hadley Wickham suggested:
"I think it's much more appropriate to show two plots - one of all the data, and one of just the small values."
This code creates two plots
library(ggplot2)
g1 <- ggplot(df) +
  aes(x = b, y = a) +
  geom_col() +
  coord_flip()
g2 <- ggplot(df) +
  aes(x = b, y = a) +
  geom_col() +
  coord_flip() +
  ylim(NA, 10)
which can be combined into one plot by
cowplot::plot_grid(g1, g2) # or ggpubr::ggarrange(g1, g2)
or
gridExtra::grid.arrange(g1, g2) # or egg::ggarrange(g1, g2)
Faceting was suggested in a comment by Chase and also by Brian Diggs in his answer, who interpreted Hadley's suggestion as a call to use "faceted plots, one with all the data, one zoomed in a particular region", but no code has been supplied for this approach so far.
As there is no simple way to scale facets separately (see, e.g., the related question), the data needs to be manipulated:
library(dplyr)
library(ggplot2)
ggplot() +
  aes(x = b, y = a) +
  geom_col(data = df %>% mutate(subset = "all")) +
  geom_col(data = df %>% filter(a <= 10) %>% mutate(subset = "small")) +
  coord_flip() +
  facet_wrap(~ subset, scales = "free_x")