I want to create a custom function for the collapse package where I supply unquoted grouping variables like so:
library(collapse)
library(tidyverse)
fgroup_by_no_entries <- function(df, ...) {
  df %>%
    fgroup_by(...) %>%
    fmutate(no_of_entries = GRPN()) %>%
    fungroup()
}
mtcars %>%
  fgroup_by_no_entries(., c(cyl, gear))
# incorrect solution
# or
fgroup_by_no_entries <- function(df, grouping_vars) {
  grouping_vars <- ensyms(grouping_vars)
  # or
  # grouping_vars <- as.list(substitute(grouping_vars))[-1]
  df %>%
    fgroup_by(!!!grouping_vars) %>%
    fmutate(no_of_entries = GRPN()) %>%
    fungroup()
}
mtcars %>%
  fgroup_by_no_entries(., c(cyl, gear))
# doesn't work
If I was using dplyr, I could use across like so:
group_by_no_entries <- function(df, grouping_vars) {
  df %>%
    group_by(across({{grouping_vars}})) %>%
    mutate(no_of_entries = n()) %>%
    ungroup()
}
mtcars %>%
  group_by_no_entries(., c(cyl, gear))
But it seems that across in collapse can only be used within fmutate and fsummarise, see here.
How do I capture grouping names within a collapse function?
thanks
What I think is happening is that fgroup_by (or GRP) does not (yet) handle this kind of non-standard evaluation. Similarly, group_by would fail without the curly braces {{. I am not sure why you get this result when using c.
The problem is that {{ is a tidyverse-specific (rlang) operator for non-standard evaluation that collapse has not yet implemented. I am not sure how collapse deals with non-standard evaluation in general, but the simplest approach I could find is to keep your first function and pass collapse::.c instead of c.
.c is a "small helper function" that allows non-standard (that is, unquoted) concatenation, so that .c(a, b) returns the same as c("a", "b").
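As a minimal sketch of that behaviour (assuming an installed collapse version that exports .c; the column names are the ones from the question), the helper can be checked on its own:

```r
library(collapse)

# .c() quotes its arguments, so bare names become a character vector
vars <- .c(cyl, gear)

# Both checks are expected to hold if .c() behaves as described above
is.character(vars)
identical(vars, c("cyl", "gear"))
```

Because the result is an ordinary character vector, it can be passed through ... to fgroup_by in the first function from the question.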
mtcars %>%
  fgroup_by_no_entries(., .c(cyl, gear))
all.equal(
  fgroup_by_no_entries(mtcars, .c(cyl, gear)) |>
    getElement("no_of_entries"),
  group_by_no_entries(mtcars, .c(cyl, gear)) |>
    getElement("no_of_entries")
)
#[1] TRUE
One way to do this is to capture the function's arguments and convert calls of the form c(x, y) to c("x", "y"). This assumes you do not want to use dplyr::group_by(), which is compatible with collapse::fmutate() and is probably the easiest option, though not quite as fast.
library(collapse)
fgroup_by_no_entries <- function(df, ...) {
  args <- substitute(alist(df, ...))
  args[-1] <- lapply(args[-1], \(x) {
    if (is.call(x) && identical(x[[1]], quote(c))) {
      x[-1] <- as.character(x[-1])
    }
    x
  })
  do.call(fgroup_by, eval(args)) |>
    fmutate(no_of_entries = GRPN()) |>
    fungroup()
}
mtcars |>
  fgroup_by_no_entries(c(gear, carb)) |>
  head(4)
                mpg cyl disp  hp drat    wt  qsec vs am gear carb no_of_entries
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4             4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4             4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1             4
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1             3
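The rewriting step inside that function can be looked at in isolation. The following is a base R sketch only: the quoted call stands in for what substitute() would capture from ..., and the variable names (expr, is_c_call, cols) are made up for illustration.

```r
# A quoted call standing in for what substitute() captures from `...`
expr <- quote(c(gear, carb))

# The same check used inside the function: is this a call to c()?
is_c_call <- is.call(expr) && identical(expr[[1]], quote(c))

# Replace the arguments (everything after the function name) with strings
if (is_c_call) {
  expr[-1] <- as.character(expr[-1])
}

# The rewritten call now contains only string constants, so it can be
# evaluated without the data frame's columns being in scope
cols <- eval(expr)
print(cols)
```

Calls that are not of the form c(...), such as gear:carb or is.factor, fail the check and are handed to fgroup_by unchanged.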
This also works with the other selection options offered by fgroup_by(), e.g.:
mtcars |>
  fgroup_by_no_entries(gear:carb) |>
  head(4)
                mpg cyl disp  hp drat    wt  qsec vs am gear carb no_of_entries
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4             4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4             4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1             4
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1             3
mtcars |>
  fmutate(across(gear:carb, as.factor)) |>
  fgroup_by_no_entries(is.factor) |>
  head(4)
                mpg cyl disp  hp drat    wt  qsec vs am gear carb no_of_entries
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4             4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4             4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1             4
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1             3
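For comparison, the dplyr::group_by() route mentioned at the start of this answer might look roughly like the sketch below. It assumes, as stated there, that collapse::fmutate() accepts a dplyr grouped data frame; the helper name group_by_then_fmutate is hypothetical and not part of either package. dplyr functions are called with an explicit namespace prefix to avoid masking issues between the two packages.

```r
library(collapse)

# Hypothetical helper: dplyr handles the tidy-eval grouping,
# collapse computes the group size on the resulting grouped data frame
group_by_then_fmutate <- function(df, grouping_vars) {
  df |>
    dplyr::group_by(dplyr::across({{ grouping_vars }})) |>
    fmutate(no_of_entries = GRPN()) |>
    dplyr::ungroup()
}

mtcars |>
  group_by_then_fmutate(c(cyl, gear)) |>
  head(4)
```

Note that dplyr::group_by() returns a tibble, so the row names of mtcars are not carried over in this variant.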