In previous versions of dplyr, if I wanted to get row counts in addition to other summary values using summarise(), I could do something like:
library(tidyverse)
df <- tibble(
  group = c("A", "A", "B", "B", "C"),
  value = c(1, 2, 3, 4, 5)
)
df %>%
  group_by(group) %>%
  summarise(total = sum(value), count = n())
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 3
group total count
<chr> <dbl> <int>
1 A 3 2
2 B 7 2
3 C 5 1
My instinct to get the same output using the new across() function would be:
df %>%
  group_by(group) %>%
  summarise(across(value, list(sum = sum, count = n)))
Error: Problem with `summarise()` input `..1`.
x unused argument (col)
ℹ Input `..1` is `across(value, list(sum = sum, count = n))`.
ℹ The error occurred in group 1: group = "A".
The issue is specific to the n() function; just calling sum() works as expected:
df %>%
  group_by(group) %>%
  summarise(across(value, list(sum = sum)))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 2
group value_sum
<chr> <dbl>
1 A 3
2 B 7
3 C 5
I've tried out various syntactic variations (using lambdas, experimenting with cur_group(), etc.), to no avail. How would I get the desired result within across()?
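We can wrap n() in a lambda function, while sum can be passed by name because no other arguments need to be specified. across() calls each function in the list with the column as its first argument; n() accepts no arguments, hence the "unused argument" error, whereas ~ n() ignores the column and simply returns the group size.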
library(dplyr)
df %>%
  group_by(group) %>%
  summarise(across(value, list(sum = sum, count = ~ n())), .groups = 'drop')
Output:
# A tibble: 3 x 3
# group value_sum value_count
# <chr> <dbl> <int>
#1 A 3 2
#2 B 7 2
#3 C 5 1
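Since n() does not depend on any particular column, another option (a minimal sketch, not part of the original answer) is to keep it outside across() as a separate summary. The name res is only for illustration, and note that the count column is then called count rather than value_count:

```r
library(dplyr)

df <- tibble(
  group = c("A", "A", "B", "B", "C"),
  value = c(1, 2, 3, 4, 5)
)

# across() handles the per-column summaries; n() takes no arguments,
# so it is supplied as an ordinary summarise() expression alongside it.
res <- df %>%
  group_by(group) %>%
  summarise(across(value, list(sum = sum)), count = n(), .groups = "drop")

res
```

This may read more naturally when across() spans several columns, because the row count then appears once instead of once per column.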