I have a package with custom summary() and print() methods for objects that have a particular class. This package also uses the wonderful dplyr package for data manipulation, and I expect my users to write scripts that use both my package and dplyr.
One roadblock, which has been noted by others here and here, is that dplyr verbs don't preserve custom classes, meaning that an ungroup command can strip my data.frames of their custom classes and thus break method dispatch for summary, etc.
Hadley says "doing this correctly is up to you - you need to define a method for your class for each dplyr method that correctly restores all the classes and attributes" and I'm trying to take the advice - but I can't figure out how to correctly wrap the dplyr verbs.
Here's a simple toy example. Let's say I've defined a cars class, and I have a custom summary for it.
library(tidyverse)
class(mtcars) <- c('cars', class(mtcars))
summary.cars <- function(x, ...) {
  # gather some summary stats from the object being summarised (x, not the global mtcars)
  df_dim <- dim(x)
  quantile_sum <- map(x, quantile)
  cat("A cars object with:\n")
  cat(df_dim[[1]], 'rows and ', df_dim[[2]], 'columns.\n')
  print(quantile_sum)
}
summary(mtcars)
small_cars <- mtcars %>% filter(cyl < 6)
summary(small_cars)
class(small_cars)
That summary call for small_cars just gives me the generic summary, not my custom method, because small_cars no longer retains the cars class after dplyr filtering.
First I tried writing a custom method around filter (filter.cars). That didn't work, because filter is actually a wrapper around filter_ that allows for non-standard evaluation.
So I wrote a custom filter_ method for cars objects, attempting to implement @jwdink's advice:
filter_.cars <- function(df, ...) {
  old_classes <- class(df)
  out <- dplyr::filter_(df, ...)
  new_classes <- class(out)
  class(out) <- c(new_classes, old_classes) %>% unique()
  out
}
That doesn't work - I get an infinite recursion error:
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
All I want to do is grab the classes on the incoming df, hand off to dplyr, then return the object with the same class names as it had before the dplyr call. How do I change my filter_ wrapper to accomplish that? Thanks!
UPDATE:
Some things have changed since my original answer:
dplyr::filter keeps the class. However, some verbs (like dplyr::group_by) still remove the class, so this question lives on.
I recently ran into a hard-to-figure-out issue due to the second bullet, so I just wanted to give a fuller example. Let's say you're using a custom class named custom_class, and you want to add a group_by method. Assuming you're using roxygen:
#' group_by.custom_class
#'
#' @description Preserve the class of a `custom_class` object.
#' @inheritParams dplyr::group_by
#'
#' @importFrom dplyr group_by
#'
#' @export
#' @method group_by custom_class
group_by.custom_class <- function(.data, ...) {
  result <- NextMethod()
  return(reclass(.data, result))
}
(See the original answer below for the definition of the reclass function.)
Highlights:
@method group_by custom_class adds S3method(group_by,custom_class) to NAMESPACE.
@importFrom dplyr group_by adds importFrom(dplyr,group_by) to your NAMESPACE.
I believe in R < 3.5 you could get away with just that second one, but now you need both.
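For quick experimentation outside a package, the same idea can be tried in a plain script, where no NAMESPACE entries are involved. The following is only a minimal sketch: the cars class and the my_cars object are illustrative names, and reclass is written here as a plain helper rather than the generic from the original answer.

```r
library(dplyr)

# Plain helper (an assumption for this sketch): put the original
# leading class back in front of whatever classes dplyr returned.
reclass <- function(x, result) {
  class(result) <- unique(c(class(x)[[1]], class(result)))
  result
}

# S3 method for the hypothetical "cars" class; NextMethod() hands off
# to dplyr's data.frame method instead of re-dispatching to this one.
group_by.cars <- function(.data, ...) {
  result <- NextMethod()
  reclass(.data, result)
}

my_cars <- mtcars
class(my_cars) <- c("cars", class(my_cars))

grouped <- group_by(my_cars, cyl)
class(grouped)
```

If the method is picked up, class(grouped) should start with "cars" and still include "grouped_df", so both your own methods and dplyr's grouped behaviour remain available.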
OLD ANSWER:
Further suggestions were offered in the thread, so I thought I'd update with what seems to be best practice, which is to use NextMethod().
filter_.cars <- function(.data, ...) {
  result <- NextMethod()
  reclass(.data, result)
}
Here reclass is written by you; it's just a generic that (at least) adds the original class back on:
reclass <- function(x, result) {
  UseMethod('reclass')
}
reclass.default <- function(x, result) {
  class(result) <- unique(c(class(x)[[1]], class(result)))
  result
}
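To see why NextMethod() avoids the infinite recursion from the question, it may help to look at a base R toy with no dplyr involved. In the question's wrapper, dplyr::filter_(df, ...) is called on an object that still carries the cars class, so dispatch lands back in filter_.cars. The sketch below uses a made-up generic, shrink, as a stand-in for a dplyr verb that drops extra classes; none of these names come from dplyr.

```r
# Hypothetical generic standing in for a dplyr verb.
shrink <- function(x, ...) UseMethod("shrink")

# "Library" method: keeps the first row and drops any extra classes.
shrink.data.frame <- function(x, ...) {
  out <- x[1, , drop = FALSE]
  class(out) <- "data.frame"
  out
}

# Wrapper for the custom class: NextMethod() moves on to
# shrink.data.frame rather than calling shrink.cars again.
shrink.cars <- function(x, ...) {
  result <- NextMethod()
  class(result) <- unique(c(class(x)[[1]], class(result)))
  result
}

df <- structure(data.frame(a = 1:3), class = c("cars", "data.frame"))
out <- shrink(df)
class(out)
# [1] "cars"       "data.frame"
```

Calling shrink(x) inside shrink.cars instead of NextMethod() would dispatch on the cars class again, which is the same loop the question ran into.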