
Improve poor automatic tick position choices without explicitly specifying breaks

Tags: r, ggplot2

I find that ggplot2 sometimes produces too few tick marks when using scale_y_log10. I am trying to produce plots automatically from arbitrary data, and I'm looking for a way to increase the number of tick marks without explicitly specifying them (since I don't know ahead of time what the data will be). For instance, here's a function to create a simple scatterplot with a log-y-scale:

library(ggplot2)

example_plot <- function(x) {
  # plot the data frame passed in as x, with a log-scaled y axis
  p <- ggplot(x, aes(x = MW, y = rel.Ki)) +
    geom_point() +
    scale_y_log10()
  p
}

This will often work well, but with the following data

d <- structure(list(MW = c(89.09, 174.2, 147.13, 75.07, 131.17, 131.17, 146.19, 149.21, 165.19, 115.13, 181.19, 117.15), rel.Ki = c(2.91438577473767, 1, 1.07761254731238, 1.0475715900998, 0.960123906592881, 0.480428471483881,  1.50210548081627, 0.318457530434953, 0.458477212731015, 1.92246139937586,  0.604121577795352, 2.4111345825694)), .Names = c("MW", "rel.Ki"), class = "data.frame", row.names = c(1L, 6L, 11L, 16L, 21L, 26L, 31L, 36L, 41L, 47L, 54L, 59L))

it produces

print(example_plot(d))


The single tick mark on the y axis is not very helpful. Is there any way I can prevent this situation, short of rewriting the automatic tick-position-picking function?

Drew Steen asked Aug 28 '13 05:08



2 Answers

You can set the limits programmatically. For example, using the data you provide, we can define the limits in the function like this:

example_plot <- function(x){
  # identify the range of data
  lims <- c(10^floor(log10(min(x$rel.Ki, na.rm=TRUE))), 
    10^ceiling(log10(max(x$rel.Ki, na.rm=TRUE))))
  # require ggplot2
  require('ggplot2')
  # create the plot
  p <- ggplot(data = x, aes(x = MW, y = rel.Ki)) + 
    geom_point() +
    scale_y_log10(limits = lims)
  p
}

print(example_plot(d))

Then you get a plot with ticks at the nearest decade:

[Image: plot with the y-axis limits set programmatically]
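To make the decade rounding concrete, here is the `lims` computation worked through on the `rel.Ki` values from the question. This is only an illustrative sketch; the variable names `rel_ki`, `lo` and `hi` are introduced here and are not part of the function above.

```r
# rel.Ki values copied from the question's data frame `d`
rel_ki <- c(2.91438577473767, 1, 1.07761254731238, 1.0475715900998,
            0.960123906592881, 0.480428471483881, 1.50210548081627,
            0.318457530434953, 0.458477212731015, 1.92246139937586,
            0.604121577795352, 2.4111345825694)

# smallest value is about 0.318: log10 is about -0.497, floor gives -1
lo <- 10^floor(log10(min(rel_ki, na.rm = TRUE)))

# largest value is about 2.914: log10 is about 0.465, ceiling gives 1
hi <- 10^ceiling(log10(max(rel_ki, na.rm = TRUE)))

lims <- c(lo, hi)
print(lims)  # 0.1 and 10, i.e. the enclosing decades
```

Because the limits always land on powers of ten, the axis spans at least one full decade whatever the data, which is why at least two decade ticks can appear.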

Then, if you want to add a logarithmic grid, use the breaks option to scale_y_log10() as Marius et al. suggest:

example_plot <- function(x){
  # identify the range of data
  lims <- c(10^floor(log10(min(x$rel.Ki, na.rm=TRUE))),
            10^ceiling(log10(max(x$rel.Ki, na.rm=TRUE))))
  # require ggplot2
  require('ggplot2')
  # create the plot
  p <- ggplot(data = x, aes(x = MW, y = rel.Ki)) +
    geom_point() +
    scale_y_log10(breaks = pretty(x = lims, n = 5),
                  limits = lims)
  p
}

print(example_plot(d))

Personally I prefer logarithmic plots to show at least an order of magnitude variation, so this approach helps ensure that happens.


Andy Clifton answered Sep 19 '22 13:09


An interesting discovery I just made by reading ?continuous_scale is that the breaks argument can be:

a function, that when called with a single argument, a character vector giving the limits of the scale, returns a character vector specifying which breaks to display.

So to guarantee a certain number of breaks, you could do something like:

break_setter = function(lims) {
  return(seq(from=as.numeric(lims[1]), to=as.numeric(lims[2]), length.out=5))
}

ggplot(d, aes(x=MW, y=rel.Ki)) + 
    geom_point() +
    scale_y_log10(breaks=break_setter)

Obviously the very simple example function is not very well adapted to the logarithmic nature of the data, but it does show how you could approach this a bit more programmatically.
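One way to adapt the idea to a log axis is to space the breaks evenly in log10 space instead of in data space. The following is a rough sketch, not a polished solution: `log_break_setter` is a hypothetical helper name, the rounding to two significant digits is an arbitrary choice, and it assumes the function is handed positive limits on the original data scale.

```r
# Hypothetical helper: builds a break function that returns up to n breaks
# evenly spaced in log10 space, rounded to 2 significant digits for labels.
log_break_setter <- function(n = 5) {
  function(lims) {
    lims <- as.numeric(lims)
    # evenly spaced exponents between the two limits
    exponents <- seq(log10(lims[1]), log10(lims[2]), length.out = n)
    # back-transform, round, and drop duplicates created by the rounding
    unique(signif(10^exponents, 2))
  }
}

# Quick check on limits spanning two decades
log_break_setter(5)(c(0.1, 10))
```

The returned function can be passed to the scale in the same way as above, for example `scale_y_log10(breaks = log_break_setter())`. The breaks are not as round as decade ticks, but they appear equally spaced on the log axis.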


You can also use pretty, which takes a suggestion for a number of breaks and returns nice round numbers. Using

break_setter = function(lims) {
    return(pretty(x = as.numeric(lims), n = 5))
}

yields the following:

[Image: plot with breaks chosen by pretty()]

Even better, we can make break_setter() return an appropriate function with whatever n you want and a default of, say, 5.

break_setter = function(n = 5) {
   function(lims) {pretty(x = as.numeric(lims), n = n)}
}

ggplot(d, aes(x=MW, y=rel.Ki)) + 
    geom_point() +
    scale_y_log10(breaks=break_setter())  ## 5 breaks as above

ggplot(d, aes(x=MW, y=rel.Ki)) + 
    geom_point() +
    scale_y_log10(breaks=break_setter(20))
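
The scales package, which ggplot2 builds on, also provides a break generator aimed at log axes. Here is a minimal sketch, assuming the installed scales version exports `breaks_log()` (older releases name it `log_breaks()`). The value `n = 8` is an arbitrary choice, and `n` is a desired count, not a guaranteed one.

```r
# Sketch only: runs when both ggplot2 and scales are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("scales", quietly = TRUE)) {
  library(ggplot2)

  # same values as the question's data frame `d`
  d <- data.frame(
    MW = c(89.09, 174.2, 147.13, 75.07, 131.17, 131.17,
           146.19, 149.21, 165.19, 115.13, 181.19, 117.15),
    rel.Ki = c(2.91438577473767, 1, 1.07761254731238, 1.0475715900998,
               0.960123906592881, 0.480428471483881, 1.50210548081627,
               0.318457530434953, 0.458477212731015, 1.92246139937586,
               0.604121577795352, 2.4111345825694)
  )

  # breaks_log() returns a function of the limits, like break_setter() above
  p <- ggplot(d, aes(x = MW, y = rel.Ki)) +
    geom_point() +
    scale_y_log10(breaks = scales::breaks_log(n = 8))
  print(p)
}
```

Like `break_setter()`, this needs no knowledge of the data ahead of time, since the breaks are computed from whatever limits the scale ends up with.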

Marius answered Sep 19 '22 13:09