How to mutate multiple columns as function of multiple columns systematically?

I have a tibble with a number of variables collected over time. A very simplified version of the tibble looks like this.

library(dplyr)

df = tribble(
  ~id, ~varA.t1, ~varA.t2, ~varB.t1, ~varB.t2,
  'row_1', 5, 10, 2, 4,
  'row_2', 20, 50, 4, 6
)

I want to systematically create a new set of variables varC so that varC.t# = varA.t# / varB.t#, where # is 1, 2, 3, etc. (following the same naming pattern as the columns in the tibble above).

How do I use something along the lines of mutate or across to do this?

644

asked Apr 10 '21 03:04

2 Answers

You can do something like this with mutate(across(...)); the column renaming could probably be done more concisely, though.

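library(dplyr)
library(stringr)

df %>%
  # divide each varA column by the varB column with the same time suffix
  mutate(across(.cols = c(varA.t1, varA.t2),
                .fns = ~ .x / get(str_replace(cur_column(), "varA", "varB")),
                .names = "V_{.col}")) %>%
  # rename the temporary V_varA.* columns to varC.*
  rename_with(~ str_replace(., "V_varA", "varC"), starts_with("V_"))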

# A tibble: 2 x 7
  id    varA.t1 varA.t2 varB.t1 varB.t2 varC.t1 varC.t2
  <chr>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 row_1       5      10       2       4     2.5    2.5 
2 row_2      20      50       4       6     5      8.33

If there is a long time series you can also create a vector for .cols beforehand.
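As a rough sketch of that idea (not part of the original answer): the vector name varA_cols below is made up for illustration, and the code assumes that every varA.t# column has a matching varB.t# column. It also assumes that the .names glue string may contain an expression such as sub(), which would make the separate rename step unnecessary.

```r
library(dplyr)
library(stringr)

df <- tribble(
  ~id, ~varA.t1, ~varA.t2, ~varB.t1, ~varB.t2,
  "row_1", 5, 10, 2, 4,
  "row_2", 20, 50, 4, 6
)

# Hypothetical helper vector: every column whose name starts with "varA."
varA_cols <- str_subset(names(df), "^varA\\.")

res <- df %>%
  mutate(across(.cols = all_of(varA_cols),
                # look up the varB column that shares the time suffix
                .fns = ~ .x / get(str_replace(cur_column(), "varA", "varB")),
                # name each output column varC.t# directly (assumed glue usage)
                .names = "{sub('varA', 'varC', .col)}"))

res
```

With more time points, only the pattern passed to str_subset() decides which columns are processed, so the mutate() call itself should not need to change.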

189

answered Oct 02 '22 00:10

AnilGoyal

I have a package on GitHub called {dplyover} which aims to solve this kind of problem in a way similar to dplyr::across.

The function is called across2. It lets you define two sets of columns to which you can apply one or several functions. The .names argument supports two glue specifications: {pre} and {suf}. They extract the shared prefix and suffix of the variable names. This makes it easy to put nice names on our output variables.

The function has one caveat. It is not performant when applied to highly grouped data (there is a vignette with benchmarks).

library(dplyr)
library(dplyover) # https://github.com/TimTeaFan/dplyover

df = tribble(
  ~id, ~varA.t1, ~varA.t2, ~varB.t1, ~varB.t2,
  'row_1', 5, 10, 2, 4,
  'row_2', 20, 50, 4, 6
)

df %>% 
  mutate(across2(starts_with("varA"),
                 starts_with("varB"),
                 ~ .x / .y,
                 .names = "{pre}C.{suf}"))

#> # A tibble: 2 x 7
#>   id    varA.t1 varA.t2 varB.t1 varB.t2 varC.t1 varC.t2
#>   <chr>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#> 1 row_1       5      10       2       4     2.5    2.5 
#> 2 row_2      20      50       4       6     5      8.33

Created on 2021-04-10 by the reprex package (v0.3.0)

answered Oct 02 '22 02:10

TimTeaFan

Tags: r, dplyr, across

Maher Said