
Extending ggplot2 properly?

Recently a few neat uses of ggplot2 have come up, and either partial or full solutions have been posted:

  • ggheat
  • Curly braces
  • position_dynamic

ggheat is notable because it breaks the ggplot2 metaphor by drawing the plot directly rather than returning a plot object.

The curly brace solutions are notable because none really fits the ggplot2 high-level concept (e.g. you should be able to specify the range of points you want to mark, and then somewhere else specify the geom for how that range is displayed: brace, box, purple cow, etc.).

The ggplot2 book (I have read the two online chapters and will order it soon) seems to be about using the grammar and functions rather than writing new ones or extensively extending existing ones.

I would like to learn to add a specific feature or develop a new geom, and do it properly. ggplot2 may not be intended as a general graphics package in the same way that grid or base graphics are, but there are a great many graphs which are only a step or two extension from an existing ggplot2 geom. When these situations come up, I can typically put together enough objects to do something once, but what if I need the same plot a few dozen times? What if other people like it and want to use it--now they have to kludge through the same process each time they want that graph. It seems to me that the proper solution is to add in a stat_heatplot and geom_heatplot, or to add a geom_Tuftebox for Tufte box plots, etc. Yet I've never seen an example of actually extending ggplot2; just examples of how to use it.

What resources exist to dig deeper into ggplot2 and start extending it? I'm particularly interested in a high-level way to specify a range on an axis as described above, but general knowledge about what makes ggplot2 tick is welcome as well.

Absent a coherent guide (which rarely exists for sufficiently advanced tinkering and therefore may not exist here), how would one go about learning about the internals? Inspecting the source is obviously one way, but which functions should one start with?

Ari B. Friedman asked Aug 11 '11 16:08



1 Answer

ggplot2 is gradually becoming more and more extensible. The development version, https://github.com/hadley/ggplot2/tree/develop, uses roxygen2 (instead of two separate homegrown systems), and has begun the switch from proto to simpler S3 classes (currently complete for coords and scales). These two changes should hopefully make the source code easier to understand, and hence easier for others to extend (backed up by the fact that pull requests for ggplot2 are increasing).

Another big improvement that will be included in the next version is Kohske Takahashi's improvements to the guide system (https://github.com/kohske/ggplot2/tree/feature/new-guides-with-gtable). As well as improving the default guides (e.g. with elegant continuous colour bars), his changes also make it easier to override the defaults with your own custom legends and axes. This would make it possible to draw the curly braces in the axes, where they probably belong.

The next big round of changes (which I probably won't be able to tackle until summer 2012) will include a rewrite of geoms, stats and position adjustments, along the lines of the sketch in the layers package (https://github.com/hadley/layers). This should make geoms, stats and position adjustments much easier to write, and will hopefully foster more community contributions, such as a geom_tufteboxplot.

hadley answered Sep 22 '22 02:09
