
What are the limits to inheritance in ggplot2?

Tags:

r

ggplot2

eval

I have been trying to work out a few things about ggplot2, in particular how supplementary arguments inherit from the initial ggplot() call. Specifically, I want to know whether inheritance is passed on beyond the geom_*() part.

I have a histogram of data:

ggplot(data = faithful, aes(eruptions)) + geom_histogram()

This produces a fine chart, though the breaks are the defaults. It appears to me (an admitted novice) that geom_histogram() is inheriting the data specification from ggplot(). If I want a smarter way of setting the breaks, I could do something like this:

ggplot(data = faithful, aes(eruptions)) + 
geom_histogram(breaks = seq(from = min(faithful$eruptions), 
                            to = max(faithful$eruptions), length.out = 10))

However, here I am re-specifying within the geom_histogram() function that I want faithful$eruptions. I have been unable to find a way to phrase this without re-specifying. Further, if I use the data = argument in geom_histogram(), and specify just eruptions in min and max, seq() still doesn't understand that I mean the faithful data set.

I know that seq is not part of ggplot2, but I wondered if it might be able to inherit regardless, as it is bound within geom_histogram(), which itself inherits from ggplot(). Am I just using the wrong syntax, or is this possible?

DaveRGP Avatar asked May 19 '15 13:05



2 Answers

Note that the term you are looking for is not "inheritance" but non-standard evaluation (NSE). ggplot2 offers a few places where you can refer to your data by column name instead of by a full reference such as faithful$eruptions, but those are only the mapping arguments to ggplot() and the geom_* layers, and even then only inside aes(). These work:

ggplot(faithful) + geom_point(aes(eruptions, eruptions))
ggplot(faithful) + geom_point(aes(eruptions, eruptions, size=waiting))

The following doesn't work because we are referring to waiting outside of aes and mapping (note first arg to geom_* is the mapping arg):

ggplot(faithful) + geom_point(aes(eruptions, eruptions), size=waiting)

But this works:

ggplot(faithful) + geom_point(aes(eruptions, eruptions), size=faithful$waiting)

though differently, since size is now interpreted literally instead of being scaled, as it is when it is part of the mapping.
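To make the difference between mapping and setting size concrete, here is a small sketch that compares the sizes ggplot2 actually computes in each case. It assumes ggplot2 is installed and uses layer_data() to pull out the data a layer is drawn from; the variable names are just for illustration.

```r
library(ggplot2)

# Mapped: size goes through aes(), so a size scale rescales the values.
p_mapped <- ggplot(faithful) +
  geom_point(aes(eruptions, eruptions, size = waiting))

# Set: size is a plain argument, so the raw numbers are used as point sizes.
p_set <- ggplot(faithful) +
  geom_point(aes(eruptions, eruptions), size = faithful$waiting)

# layer_data() returns the computed data for the given layer of a plot.
size_mapped <- layer_data(p_mapped, 1)$size
size_set    <- layer_data(p_set, 1)$size

range(size_mapped)  # values after the size scale has been applied
range(size_set)     # same as range(faithful$waiting)
```

In the second plot the points are drawn at sizes of roughly 40 to 100, which is why it looks so different from the first.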

In your case, since breaks is not part of the aes/mapping spec, you can't use NSE and you are left using the full reference. Some possible work-arounds:

ggplot(data = faithful, aes(eruptions)) + geom_histogram(bins=10)  # not identical
ggplot(data=faithful, aes(eruptions)) + 
  geom_histogram(
    breaks=with(faithful,  # use `with` so column names resolve inside faithful
      seq(from=min(eruptions), to=max(eruptions), length.out=10)
    )
  )

And without NSE, but with a little less typing:

ggplot(data=faithful, aes(eruptions)) + 
  geom_histogram(
    breaks=do.call(seq, c(as.list(range(faithful$eruptions)), len=10))
  )
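Another option, if the same breaks are needed in several places, is to move the computation into a small function and compute the breaks once before building the plot. This is only a sketch: even_breaks is a hypothetical helper name, not something ggplot2 provides, and the plot is built only if ggplot2 is available.

```r
# Hypothetical helper (not part of ggplot2): n evenly spaced values spanning x.
even_breaks <- function(x, n = 10) {
  seq(from = min(x), to = max(x), length.out = n)
}

# The column is still referenced in full, but only once.
eruption_breaks <- even_breaks(faithful$eruptions)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data = faithful, aes(eruptions)) +
    geom_histogram(breaks = eruption_breaks)
}
```

This does not avoid the full reference, but it keeps it out of the geom_histogram() call.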
BrodieG Avatar answered Nov 04 '22 04:11



Based on the ggplot2 documentation, it seems that the + operator, which is really the +.gg function, allows adding the following objects to a ggplot object: data.frame, uneval, layer, theme, scale, coord, facet.

The geom_* functions create layers, which inherit the data and aes from the ggplot object "above" unless stated otherwise.
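As a quick illustration of "unless stated otherwise", the sketch below adds two point layers to the same plot: one that inherits everything and one that is given its own data. It assumes ggplot2 is installed; the five-row subset is arbitrary.

```r
library(ggplot2)

p <- ggplot(faithful, aes(eruptions, waiting)) +
  geom_point() +                                        # inherits data and aes
  geom_point(data = head(faithful, 5), colour = "red")  # own data, inherited aes

# Compare how many rows each layer is drawn from.
nrow(layer_data(p, 1))  # all rows of faithful
nrow(layer_data(p, 2))  # only the five rows supplied to the layer
```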

However, the call to seq() is ordinary R code, evaluated in the environment where the plot is written (usually the global environment). seq() does not create one of the objects listed above, so it is not combined with the plot by the + operator and inherits nothing from the ggplot object. It therefore looks for eruptions in the global environment, which doesn't include an object of that name.
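The lookup behaviour described above can be checked in base R, without ggplot2 at all. The sketch assumes no object called eruptions has been created in the session; the variable names are illustrative.

```r
# `eruptions` is a column of faithful, not a variable in the calling environment
# (assuming the session has not defined an object with that name).
found_globally <- exists("eruptions")

# Ordinary arguments are evaluated where the call is written, so this fails;
# tryCatch() captures the error message instead of stopping.
direct <- tryCatch(min(eruptions), error = function(e) conditionMessage(e))

# with() and eval() evaluate the expression with the data frame's columns in scope.
via_with <- with(faithful, min(eruptions))
via_eval <- eval(quote(min(eruptions)), faithful)
```

This is the same mechanism that lets with(faithful, seq(...)) work inside geom_histogram(): the expression is evaluated against the data frame before the result is passed on as breaks.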

Yevgeny Tkach Avatar answered Nov 04 '22 02:11
