Many R classes have an S3 plot method associated with them. For instance, every R regression tutorial contains something like this:
dat <- data.frame(x=runif(10))
dat$y <- dat$x+runif(10)
my.lm <- lm( y~x, dat )
plot(my.lm)
This displays a set of regression diagnostic plots.
Similarly, I have an S3 object for a package which consists of a list which basically holds a few time series. I have a plot.myobject method for it which reaches into the list, yanks out the time series, and plots them on the same graph. I would like to rewrite this as a ggplot2 function so that it will be prettier and perhaps more extensible as well.
Because this package is intended to get people without much R experience up and running quickly, I'd like this to be a one-liner with one argument, as in plot(myobject), ggplot(myobject), or whatever the appropriate version might be. Then once they get hooked, they can learn more about ggplot2 and customize the graph to their heart's content.
My initial temptation was to simply replace the internals of the plot.myobject method to use ggplot2. This, however, seems like it might lose me major style points.
Is this a bad idea, and if so why and what alternative should I use?
There is an existing idiom in ggplot2 to do exactly what you propose. It is called fortify. It takes an object and produces a version of the object in a form that ggplot can work with, i.e. a data.frame. Section 9.3 in Hadley's ggplot2 book describes how to do this, using the S3 object class lm as an example. To see this in action, type fortify.lm into your console to get the following code:
function (model, data = model$model, ...)
{
    infl <- influence(model, do.coef = FALSE)
    data$.hat <- infl$hat
    data$.sigma <- infl$sigma
    data$.cooksd <- cooks.distance(model, infl)
    data$.fitted <- predict(model)
    data$.resid <- resid(model)
    data$.stdresid <- rstandard(model, infl)
    data
}
<environment: namespace:ggplot2>
Here is my own example of writing a fortify method for tree, originally published on the ggplot2 mailing list:
fortify.tree <- function(model, data, ...){
  # `data` is unused; it is kept to match the signature of the fortify generic
  require(tree)
  # Uses tree:::treeco to extract data frame of plot locations
  xy <- tree:::treeco(model)
  n <- model$frame$n

  # Lines copied from tree:::treepl
  x <- xy$x
  y <- xy$y
  node <- as.numeric(row.names(model$frame))
  parent <- match((node %/% 2), node)
  sibling <- match(ifelse(node %% 2, node - 1L, node + 1L), node)

  # One vertical and one horizontal segment per node
  linev <- data.frame(x=x, y=y, xend=x, yend=y[parent], n=n)
  lineh <- data.frame(x=x[parent], y=y[parent], xend=x,
                      yend=y[parent], n=n)

  # Drop the first row of each: the root node has no parent to connect to
  rbind(linev[-1,], lineh[-1,])
}
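# opts() and theme_blank() have been removed from ggplot2;
# theme() and element_blank() are the current equivalents.
theme_null <- theme(
  panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  axis.text.x = element_blank(),
  axis.text.y = element_blank(),
  axis.ticks = element_blank(),
  axis.title.x = element_blank(),
  axis.title.y = element_blank(),
  legend.position = "none"
)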
And the plot code. Notice that the data passed to ggplot is not a data.frame but a tree object.
library(ggplot2)
library(tree)
data(cpus, package="MASS")
cpus.ltr <- tree(log10(perf) ~ syct+mmin+mmax+cach+chmin+chmax, cpus)
p <- ggplot(data=cpus.ltr) +
  geom_segment(aes(x=x,y=y,xend=xend,yend=yend,size=n),
               colour="blue", alpha=0.5) +
  # `range` replaces the old `to` argument of scale_size()
  scale_size("n", range=c(0, 3)) +
  theme_null
print(p)
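Applied to the object in the question, the same idiom might look like the sketch below. Everything about myobject here is an assumption made for illustration: the new_myobject constructor, the series element, and the time, value and series column names are all hypothetical, since the question does not show the real structure. The plot method returns the ggplot object rather than drawing with base graphics, so typing plot(obj) at the console still draws the graph, and users can add layers to the result later.

```r
library(ggplot2)

# Hypothetical constructor: a list of named time series with class "myobject".
# The real package's object layout is not shown in the question.
new_myobject <- function(...) {
  structure(list(series = list(...)), class = "myobject")
}

# fortify method: reshape the list of series into one long data frame
# with one row per observation and a column naming the series.
fortify.myobject <- function(model, data, ...) {
  series <- model$series
  do.call(rbind, lapply(names(series), function(nm) {
    s <- series[[nm]]
    data.frame(time = as.numeric(time(s)),
               value = as.numeric(s),
               series = nm)
  }))
}

# plot method: a one-liner for beginners that builds on fortify.
# It returns the ggplot object, which is drawn when auto-printed.
plot.myobject <- function(x, ...) {
  ggplot(fortify(x), aes(x = time, y = value, colour = series)) +
    geom_line()
}

obj <- new_myobject(a = ts(cumsum(rnorm(20))),
                    b = ts(cumsum(rnorm(20))))
plot(obj)
```

Inside a package, the two methods would also need to be registered in the NAMESPACE file (for example with S3method(fortify, myobject) and S3method(plot, myobject)) so that dispatch finds them.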