ggplot generally does a good job of creating sensible breaks and labels in scales.
However, I find that in plots with many facets and perhaps a formatter= argument, the labels tend to get too "dense" and overprint, for example in the plot produced by this code:
    df <- data.frame(fac = rep(LETTERS[1:10], 100), x = rnorm(1000))
    ggplot(df, aes(x=x)) +
      geom_bar(binwidth=0.5) +
      facet_grid(~fac) +
      scale_x_continuous(formatter="percent")
I know that I can specify the breaks and labels of scales explicitly, by providing breaks= and labels= arguments to scale_x_continuous.
However, I am processing survey data with many questions and a dozen crossbreaks, so I need to find a way to do this automatically.
Is there a way of telling ggplot to calculate breaks and labels automatically, but just have fewer of them, say at the minimum, maximum and zero point?
EDIT: Ideally, I don't want to specify the minimum and maximum points, but somehow tap into the built-in ggplot training of scales, and use the default calculated scale limits.
scale_x_log10() and scale_y_log10() are shortcuts for the base-10 logarithmic transformation of an axis. The same can be achieved with, e.g., scale_x_continuous(trans = "log10"). The trans argument also accepts other transformations, such as "reverse", "log2", or "sqrt".
Example 1: Disable scientific notation on a ggplot2 axis. This can be done by applying the scale_x_continuous function in combination with the comma function of the scales package.
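A minimal sketch of that idea, assuming a reasonably recent scales package (1.1.0 or later, where label_comma() is available). The data frame `big` is made up purely for illustration:

```r
library(ggplot2)
library(scales)

# Hypothetical data with values large enough that default labels
# could switch to scientific notation
big <- data.frame(x = c(1e5, 5e5, 1e6, 2e6), y = 1:4)

# label_comma() returns a labelling function; ggplot2 calls it on the breaks
p <- ggplot(big, aes(x, y)) +
  geom_point() +
  scale_x_continuous(labels = label_comma())

# The same labelling function can be inspected outside of a plot
comma_labels <- label_comma()(c(1e5, 1e6, 2e6))
comma_labels
```

Calling the labelling function directly, as in the last lines, is a quick way to check the formatting before putting it on a scale.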
To alter the labels on the axis, add the code +labs(y= "y axis name", x = "x axis name") to your line of basic ggplot code. Note: You can also use +labs(title = "Title") which is equivalent to ggtitle .
It is useful to note that internally all scale functions in ggplot2 belong to one of three fundamental types; continuous scales, discrete scales, and binned scales.
You can pass values such as min() and max() to the scale in your call to ggplot to specify the breaks dynamically. It sounds like you are going to be applying this across a wide variety of data, so you may want to consider generalizing this into a function and adjusting the formatting, but this approach should work:
    ggplot(df, aes(x=x)) +
      geom_bar(binwidth=0.5) +
      facet_grid(~fac) +
      scale_x_continuous(
        breaks = c(min(df$x), 0, max(df$x)),
        labels = c(paste(100 * round(min(df$x), 2), "%", sep = ""),
                   paste(0, "%", sep = ""),
                   paste(100 * round(max(df$x), 2), "%", sep = ""))
      )
Alternatively, rotate the x-axis text with opts(axis.text.x = theme_text(angle = 90, hjust = 0)) so that the labels no longer overlap.
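The suggestion to generalize this into a function could look roughly like the sketch below. The helper name `min_zero_max` is hypothetical, and the plot uses geom_histogram() on the assumption that a current ggplot2 version is installed, where binning is done by geom_histogram() rather than geom_bar():

```r
library(ggplot2)

# Hypothetical helper: breaks at the minimum, zero and maximum of a vector,
# with labels formatted as whole percentages
min_zero_max <- function(values) {
  brks <- c(min(values), 0, max(values))
  list(breaks = brks, labels = paste0(round(100 * brks), "%"))
}

set.seed(1)
df <- data.frame(fac = rep(LETTERS[1:10], 100), x = rnorm(1000))

s <- min_zero_max(df$x)
p <- ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = 0.5) +
  facet_grid(~fac) +
  scale_x_continuous(breaks = s$breaks, labels = s$labels)
```

Because the helper only depends on the data vector, it can be reused for each survey question without typing the breaks by hand.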
Update
In the latest version of ggplot2 the breaks and labels arguments to scale_x_continuous accept functions, so one can do something like the following:
    library(scales)  # provides percent_format()

    # x is the range of the scale, so this yields breaks at the
    # minimum, the midpoint and the maximum
    myBreaks <- function(x){
      breaks <- c(min(x), median(x), max(x))
      names(breaks) <- attr(breaks, "labels")
      breaks
    }

    ggplot(df, aes(x=x)) +
      geom_bar(binwidth=0.5) +
      facet_grid(~fac) +
      scale_x_continuous(breaks = myBreaks, labels = percent_format()) +
      opts(axis.text.x = theme_text(angle = 90, hjust = 1, size = 5))
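For readers on a current ggplot2, where opts() and theme_text() have been replaced by theme() and element_text(), a rough equivalent might look like this. It is a sketch, not the original author's code: the function name `limits_and_zero` is hypothetical, geom_histogram() stands in for the binned geom_bar(), and label_percent() assumes scales 1.1.0 or later. The breaks function receives the range of the scale (which may include the default expansion), so no limits have to be specified by hand:

```r
library(ggplot2)
library(scales)

set.seed(1)
df <- data.frame(fac = rep(LETTERS[1:10], 100), x = rnorm(1000))

# Hypothetical breaks function: ggplot2 passes in the scale range as
# c(lower, upper); return the two ends plus zero
limits_and_zero <- function(limits) {
  sort(unique(c(limits[1], 0, limits[2])))
}

p <- ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = 0.5) +
  facet_grid(~fac) +
  scale_x_continuous(breaks = limits_and_zero,
                     labels = label_percent(accuracy = 1)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 5))
```

If zero lies outside the range of the data, it is still returned by the function, but ggplot2 does not draw breaks that fall outside the scale limits.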
The scales package contains several breaks_* and label_* functions which return functions (closures) that are used by ggplot. So, you can write wrappers for these that modify the output.
For example:
    library(ggplot2)

    # Compute the list of breaks using original_func,
    # then remove any of these that occur in remove_list
    remove_breaks <- function(original_func, remove_list = list()) {
      function(x) {
        original_result <- original_func(x)
        original_result[!(original_result %in% remove_list)]
      }
    }

    # Compute the list of labels using original_func,
    # then remove any of these that occur in remove_list
    remove_labels <- function(original_func, remove_list = list()) {
      function(x) {
        original_result <- original_func(x)
        replace(original_result, original_result %in% remove_list, '')
      }
    }

    # Original plot
    ggplot(data.frame(x=c(1,2,3,4,5,6,7,8), y = c(1,4,9,16,25,36,49,64))) +
      geom_line(aes(x, y)) +
      scale_x_continuous(breaks = scales::breaks_pretty(9),
                         minor_breaks = scales::breaks_pretty(18),
                         labels = scales::label_number_auto()) +
      scale_y_continuous(breaks = scales::breaks_pretty(9),
                         minor_breaks = scales::breaks_pretty(18),
                         labels = scales::label_number_auto())

    # Remove some breaks from the x-axis, and remove some labels from the y-axis
    ggplot(data.frame(x=c(1,2,3,4,5,6,7,8), y = c(1,4,9,16,25,36,49,64))) +
      geom_line(aes(x, y)) +
      scale_x_continuous(breaks = remove_breaks(scales::breaks_pretty(9), seq(3,6)),
                         minor_breaks = remove_breaks(scales::breaks_pretty(18), seq(3,6,0.5)),
                         labels = scales::label_number_auto()) +
      scale_y_continuous(breaks = scales::breaks_pretty(9),
                         minor_breaks = scales::breaks_pretty(18),
                         labels = remove_labels(scales::label_number_auto(), seq(20, 30)))
Of course, with my simple remove_breaks and remove_labels functions you still have to specify which values to remove, but you can easily modify these to something that removes the max and min value, removes any value in a specified range, etc.
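As one possible modification along those lines, the sketch below goes the other way and keeps only the outermost breaks plus zero, which matches the original question. The wrapper name `keep_extremes` is hypothetical. It first drops any computed breaks that fall outside the range passed in, since those would not be drawn anyway:

```r
library(scales)

# Hypothetical wrapper: compute breaks with original_func, then keep only
# the smallest and largest in-range break, plus zero if it is one of them
keep_extremes <- function(original_func) {
  function(x) {
    b <- original_func(x)
    b <- b[!is.na(b)]
    b <- b[b >= min(x) & b <= max(x)]
    if (length(b) == 0) {
      return(b)
    }
    b[b == min(b) | b == max(b) | b == 0]
  }
}

# Usage with one of the scales break functions
sparse_breaks <- keep_extremes(breaks_pretty(9))
sparse_breaks(c(-2.3, 3.1))
```

The result can be passed to a scale in the same way as the wrappers above, e.g. scale_x_continuous(breaks = keep_extremes(scales::breaks_pretty(9))).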