
How does ggplot calculate its default breaks?

Tags: r, ggplot2

The title is relatively self-explanatory. I would like to know how ggplot decides its default breaks (and hence labels).

From the code below, it looks like the method is the same for each geom:

library(ggplot2)

ggplot(data=mtcars,mapping=aes(x=carb,y=hp,fill=as.factor(gear)))+
  geom_bar(stat="identity",position="dodge")

ggplot(data=mtcars,mapping=aes(x=carb,y=hp,fill=as.factor(gear)))+
  geom_point()

Any help would be greatly appreciated.

T.Holme asked Jul 20 '16 16:07




1 Answer

I had the same question myself, and Google brought me to this SO question, so I thought I'd do a bit of digging.

Suppose we plot

library(ggplot2)
ggplot(mtcars, aes(x = cyl, y = mpg, size = hp)) +
  geom_point() 

which gives us the following plot, and we wish to know how the breaks for mpg (10, 15, ..., 35), cyl (4, 5, ..., 8), and hp (100, 150, ..., 300) are derived.

[Image: scatter plot of mpg against cyl, with point size mapped to hp]

Focusing on mpg we inspect the code for scale_y_continuous and see that it calls continuous_scale. Then, calling up ?continuous_scale we see, under the description for the trans argument, that

A transformation object bundles together a transform, its inverse, and methods for generating breaks and labels.

Then, looking up ?scales::trans_new, we see that the default value for the breaks argument is extended_breaks(). Following the trail, we find that scales::extended_breaks calls labeling::extended(rng[1], rng[2], n, only.loose = FALSE, ...). Applying this to our data,

with(mtcars, labeling::extended(range(mpg)[1], range(mpg)[2], m = 5))
# [1] 10 15 20 25 30 35

which is what we observe in the plot. This raises the question of why, despite

with(mtcars, labeling::extended(range(hp)[1], range(hp)[2], m = 5))
# [1]  50 100 150 200 250 300 350

we don't observe 50 and 350 in the legend. My understanding is that the answer is related to https://stackoverflow.com/a/13888731/6455166.
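As a rough check on that idea, here is a minimal sketch of what I think is going on. It assumes the default limits of the size scale are simply the observed range of hp, and that candidate breaks falling outside those limits are dropped from the legend. The names hp_candidates, hp_limits and hp_shown are mine, purely for illustration.

```r
library(scales)

# extended_breaks() wraps labeling::extended(); its default n = 5
# corresponds to m = 5 in the calls above.
hp_candidates <- extended_breaks()(mtcars$hp)

# Observed range of the data (52 to 335), assumed here to be the
# default limits of the size scale.
hp_limits <- range(mtcars$hp)

# Keep only the candidates inside the limits. Assumption: this mimics
# how out-of-range breaks are dropped before the legend is drawn.
hp_shown <- hp_candidates[hp_candidates >= hp_limits[1] &
                          hp_candidates <= hp_limits[2]]
print(hp_shown)
# [1] 100 150 200 250 300
```

That matches the legend. If the outer values are wanted, widening the limits should bring them back, for example by adding scale_size_continuous(limits = c(50, 350), breaks = seq(50, 350, by = 50)) to the plot.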

Weihuang Wong answered Oct 08 '22 06:10
