Using case_when with dplyr across

Tags:

dplyr

I'm trying to translate a mutate_at() call into a mutate() call using dplyr's new across() function, and I'm a bit stumped.

In a nutshell, I need to compare the values in a series of columns to a "baseline" column. When the values in the columns are higher than the baseline, I need to use the baseline value. When the values in the columns are lower than or equal to the baseline, I need to keep the value. Here's an example dataset (my actual dataset is much larger):

test <- structure(list(baseline = c(5, 7, 8, 4, 9, 1, 0, 46, 47), bob = c(7, 
11, 34, 9, 6, 8, 3, 49, 12), sally = c(3, 5, 2, 2, 6, 1, 3, 4, 
56), rita = c(6, 4, 6, 7, 6, 0, 3, 11, 3)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L), spec = structure(list(
    cols = list(baseline = structure(list(), class = c("collector_double", 
    "collector")), bob = structure(list(), class = c("collector_double", 
    "collector")), sally = structure(list(), class = c("collector_double", 
    "collector")), rita = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

My current code uses mutate_at() and works fine:

trial1 <- test %>% 
  mutate_at(
    vars('bob','sally', 'rita'),
    funs(case_when(
      . > baseline ~ baseline, 
      . <= baseline ~ .)))

But when I try to update it to reflect across() from dplyr 1.0, I keep getting an error. Here is my attempt:

trial2 <- test %>% 
  mutate(across(c(bob, sally, rita), 
                case_when(. > baseline ~ baseline, 
                          . <= baseline ~ .)))

And here is the error:

Error: Problem with mutate() input ..1.
x . > baseline ~ baseline, . <= baseline ~ . must be length 36 or one, not 9, 4.
ℹ Input ..1 is across(...)

Any ideas what I might be doing wrong? Does case_when() work with across?


asked Oct 03 '20 22:10

James DeWeese

1 Answer

across() expects a function or a lambda, not an already-evaluated expression, so we can prefix the case_when() call with ~ to turn it into an anonymous (lambda) function

library(dplyr)
test %>% 
   mutate(across(c(bob, sally, rita), 
             ~ case_when(. > baseline ~ baseline, 
                       . <= baseline ~ .)))

-output

# A tibble: 9 x 4
#  baseline   bob sally  rita
#     <dbl> <dbl> <dbl> <dbl>
#1        5     5     3     5
#2        7     7     5     4
#3        8     8     2     6
#4        4     4     2     4
#5        9     6     6     6
#6        1     1     1     0
#7        0     0     0     0
#8       46    46     4    11
#9       47    12    47     3

Or pass the lambda by name through the .fns argument (the argument is .fns, not .funs, and the ~ is still required)

test %>% 
    mutate(across(c(bob, sally, rita), 
                  .fns = ~ case_when(. > baseline ~ baseline, 
                                     . <= baseline ~ .)))
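
As for why the ~ matters: without it, case_when() is evaluated immediately instead of once per column, and the bare . appears to be picked up as the placeholder of the %>% pipe, i.e. the whole test data frame. That would account for the lengths in the error message (36, 9 and 4). The following is only a sketch of that reading, using base R and a plain data.frame copy of the question's data; the names dot and cond are made up for illustration.

```r
# Base R only; `test` mirrors the example data from the question.
test <- data.frame(
  baseline = c(5, 7, 8, 4, 9, 1, 0, 46, 47),
  bob      = c(7, 11, 34, 9, 6, 8, 3, 49, 12),
  sally    = c(3, 5, 2, 2, 6, 1, 3, 4, 56),
  rita     = c(6, 4, 6, 7, 6, 0, 3, 11, 3)
)

# Assumption: inside `test %>% mutate(...)`, a bare `.` is the pipe's
# placeholder (the whole data frame), not the current column.
dot <- test

# Comparing a data frame with a vector yields a 9 x 4 logical matrix.
cond <- dot > test$baseline

length(cond)           # 36: the left-hand side of the case_when() formula
length(test$baseline)  # 9:  the right-hand side `baseline`
length(dot)            # 4:  the right-hand side `.` (number of columns)
```

With the ~ in place, across() calls the lambda once per selected column, so . is a length-9 vector and every piece of case_when() has the same length.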

According to ?across, the .fns argument can be any of the following:

Functions to apply to each of the selected columns. Possible values are:

NULL, to return the columns untransformed.

A function, e.g. mean.

A purrr-style lambda, e.g. ~ mean(.x, na.rm = TRUE)

A list of functions/lambdas, e.g. list(mean = mean, n_miss = ~ sum(is.na(.x)))
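
To make the options listed above concrete, here is a small sketch that applies each form to the question's data with summarise(). The result names by_fun, by_lambda and stats are made up for illustration, and test is rebuilt with tibble() rather than the original dput() output.

```r
library(dplyr)

# The question's example data, rebuilt as a plain tibble.
test <- tibble(
  baseline = c(5, 7, 8, 4, 9, 1, 0, 46, 47),
  bob      = c(7, 11, 34, 9, 6, 8, 3, 49, 12),
  sally    = c(3, 5, 2, 2, 6, 1, 3, 4, 56),
  rita     = c(6, 4, 6, 7, 6, 0, 3, 11, 3)
)

# A function: applied to each selected column.
by_fun <- test %>%
  summarise(across(c(bob, sally, rita), mean))

# A purrr-style lambda: .x stands for the current column.
by_lambda <- test %>%
  summarise(across(c(bob, sally, rita), ~ mean(.x, na.rm = TRUE)))

# A named list: one result per column and list entry,
# named "{.col}_{.fn}" by default (e.g. bob_mean, bob_n_miss).
stats <- test %>%
  summarise(across(c(bob, sally, rita),
                   list(mean = mean, n_miss = ~ sum(is.na(.x)))))

names(stats)
```

A bare case_when(...) call fits none of these forms, which is why it has to be wrapped in a lambda.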

Also, since the goal is simply to cap each value at the baseline, we can replace case_when with pmin, the element-wise (parallel) minimum

test %>% 
    mutate(across(c(bob, sally, rita), ~ pmin(baseline, .)))

-output

# A tibble: 9 x 4
#  baseline   bob sally  rita
#     <dbl> <dbl> <dbl> <dbl>
#1        5     5     3     5
#2        7     7     5     4
#3        8     8     2     6
#4        4     4     2     4
#5        9     6     6     6
#6        1     1     1     0
#7        0     0     0     0
#8       46    46     4    11
#9       47    12    47     3
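
As a quick sanity check, the two approaches can be compared directly. This is a minimal sketch on the question's data (rebuilt with tibble() instead of the dput() output); the names with_case_when and with_pmin are only for illustration.

```r
library(dplyr)

# The question's example data, rebuilt as a plain tibble.
test <- tibble(
  baseline = c(5, 7, 8, 4, 9, 1, 0, 46, 47),
  bob      = c(7, 11, 34, 9, 6, 8, 3, 49, 12),
  sally    = c(3, 5, 2, 2, 6, 1, 3, 4, 56),
  rita     = c(6, 4, 6, 7, 6, 0, 3, 11, 3)
)

# Cap each column at the baseline with case_when() ...
with_case_when <- test %>%
  mutate(across(c(bob, sally, rita),
                ~ case_when(. > baseline ~ baseline,
                            . <= baseline ~ .)))

# ... and with pmin().
with_pmin <- test %>%
  mutate(across(c(bob, sally, rita), ~ pmin(baseline, .)))

# Compare as plain data frames; TRUE means both give the same values.
all.equal(as.data.frame(with_case_when), as.data.frame(with_pmin))
```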

answered Oct 23 '22 10:10

akrun
