I would like to correct a column in a data.frame by subtracting from it another column with nearly identical name, but this other column has a suffix. I would like to use the <code>mutate_at</code> function for this. Trying to figure this out, I have struggled to access the name of a column in the function part of mutate_at, to the use it to access the other column. I show this in a small example below, but basically I would like to access the name of the column used at the moment <code>.</code> and then select from the data in the pipe a column that has the same name as <code>.</code> but with a suffix (below that would be <code>"_new"</code>). Thanks for your help! Here is an example of how I would have liked to do it - but this does not work. <pre class="prettyprint"><code>library(tidyverse) data("mtcars") new <- mtcars/4 names(new) <-paste0(names(new),"_new") df <- bind_cols(mtcars,new) df %>% mutate_at(.vars = vars(carb,disp), .funs = list(corrected = ~ . - df %>% pull(paste0(names(.),"_new")))) df %>% pull(paste0("carb","_new")) </code></pre>

Instead of using <code>mutate_at</code> why not use <code>mutate</code> combined with <code>across</code> and <code>cur_column</code> i.e: <pre class="prettyprint lang-r prettyprint-override"><code>df %>% mutate( across( c(carb,disp), ~ . - pull(df, paste0(cur_column(), "_new") ), .names = "{.col}_corrected") ) </code></pre>

You can use <code>Map</code> in base R or <code>map2</code> from <code>purrr</code> for this : <pre class="prettyprint"><code>cols <- c('carb', 'disp') df[paste0(cols, '_corrected')] <- Map(`-`, df[cols], df[paste0(cols, '_new')]) </code></pre> Using <code>map2</code> <pre class="prettyprint"><code>library(purrr) df[paste0(cols, '_corrected')] <- map2(df[cols], df[paste0(cols, '_new')], `-`) </code></pre>

Here is a base R solution. We can define a function, <code>diff_col</code>, that creates the operation and then use a for loop to repeatedly apply this function to modify the data frame. <pre class="prettyprint"><code>diff_col <- function(col, dat){ dat[[paste0(col, "_corrected")]] <- dat[[col]] - dat[[paste0(col, "_new")]] return(dat) } for (name in c("carb", "disp")){ df <- diff_col(col = name, dat = df) } </code></pre>

dplyr: access column name in mutate_at function

Tags:

r

dplyr

I would like to correct a column in a data.frame by subtracting from it another column with nearly identical name, but this other column has a suffix. I would like to use the mutate_at function for this.

Trying to figure this out, I have struggled to access the name of a column in the function part of mutate_at, to the use it to access the other column.

I show this in a small example below, but basically I would like to access the name of the column used at the moment . and then select from the data in the pipe a column that has the same name as . but with a suffix (below that would be "_new").

Thanks for your help!

Here is an example of how I would have liked to do it - but this does not work.

library(tidyverse)
data("mtcars")
new <- mtcars/4
names(new) <-paste0(names(new),"_new")

df <- bind_cols(mtcars,new)

df %>% 
  mutate_at(.vars = vars(carb,disp),
            .funs = list(corrected = ~ . - df %>% pull(paste0(names(.),"_new"))))

df %>% pull(paste0("carb","_new"))

262

asked May 20 '20 00:05

Moritz Schwarz

5 Answers

Instead of using mutate_at why not use mutate combined with across and cur_column i.e:

df %>% 
  mutate( across( c(carb,disp), ~ . - pull(df, paste0(cur_column(), "_new") ),  .names = "{.col}_corrected") )

101

answered Sep 23 '22 23:09

Abdessabour Mtk

As the others already pointed out, variable names cannot be accessed in mutate_at and this holds also true for the coming mutate(across()). I addressed this issue as feature request for dplyr here, but apparently, this kind of data-wrangling task is too specialized for dplyr. Below I provide my favorite work-around for this kind of data-wrangling problems, which consists of two steps:

define a custom mutate function using !! rlang::sym() to generate variable based on a character vector of variable names
apply this custom function using purrr::reduce.

library(tidyverse)

# your toy data
df <- mtcars %>% 
         as_tibble %>% 
         mutate_all(list(new =~ ./4))

# step 1: generate helper function, in this case a simple `mutate` call

gen_corrected <- function(df, x) {

  mutate(df,
         "{x}_corrected" := !! rlang::sym(x) - !! rlang::sym(str_c(x, "_new"))
  )
}

# step 2:
# use purrr's `reduce` on the vector of vars you want to change
# the vector of variables can be defined in a separate step
# important: you need to set `.init = .`

df %>% 
  purrr::reduce(c('carb', 'disp'), gen_corrected, .init = .)
#> # A tibble: 32 x 24
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb mpg_new
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4    5.25
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4    5.25
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1    5.7 
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1    5.35
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2    4.68
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1    4.53
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4    3.58
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2    6.1 
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2    5.7 
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4    4.8 
#> # … with 22 more rows, and 12 more variables: cyl_new <dbl>, disp_new <dbl>,
#> #   hp_new <dbl>, drat_new <dbl>, wt_new <dbl>, qsec_new <dbl>, vs_new <dbl>,
#> #   am_new <dbl>, gear_new <dbl>, carb_new <dbl>, carb_corrected <dbl>,
#> #   disp_corrected <dbl>

^{Created on 2020-05-21 by the reprex package (v0.3.0)}

In the github issue mention above @Romain Francois provides another work-around to this issue.

answered Sep 25 '22 23:09

TimTeaFan

We can't access the names of the . inside mutate_at because it is a vector and doesn't have the column name info. One option is map2

library(purrr)
library(dplyr)
library(stringr)
nm1 <- c('carb', 'disp')
map_dfc(nm1, ~ df %>%
                transmute(!!str_c(.x, '_corrected') := 
                    !! rlang::sym(.x) - !! rlang::sym(str_c(.x, "_new"))))%>%
   bind_cols(df, .) %>%
   head
# mpg cyl disp  hp drat    wt  qsec vs am gear carb mpg_new cyl_new disp_new hp_new drat_new  wt_new qsec_new
#1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   5.250     1.5    40.00  27.50   0.9750 0.65500   4.1150
#2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   5.250     1.5    40.00  27.50   0.9750 0.71875   4.2550
#3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   5.700     1.0    27.00  23.25   0.9625 0.58000   4.6525
#4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   5.350     1.5    64.50  27.50   0.7700 0.80375   4.8600
#5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   4.675     2.0    90.00  43.75   0.7875 0.86000   4.2550
#6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   4.525     1.5    56.25  26.25   0.6900 0.86500   5.0550
#  vs_new am_new gear_new carb_new carb_corrected disp_corrected
#1   0.00   0.25     1.00     1.00           3.00         120.00
#2   0.00   0.25     1.00     1.00           3.00         120.00
#3   0.25   0.25     1.00     0.25           0.75          81.00
#4   0.25   0.00     0.75     0.25           0.75         193.50
#5   0.00   0.00     0.75     0.50           1.50         270.00
#6   0.25   0.00     0.75     0.25           0.75         168.75

Or we can update with a for loop

for(nm in nm1) {
            df <- df %>%
                    mutate(!! str_c(nm, '_corrected') := 
                          !! rlang::sym(nm) - !! rlang::sym(str_c(nm, '_new')))
  }

Or using base R with lapply

 df[paste0(nm1, "_corrected")] <- lapply(nm1, function(nm)
                   df[nm] - df[paste0(nm, "_new")])

answered Sep 24 '22 23:09

akrun

You can use Map in base R or map2 from purrr for this :

cols <- c('carb', 'disp')
df[paste0(cols, '_corrected')] <- Map(`-`, df[cols], df[paste0(cols, '_new')])

Using map2

library(purrr)
df[paste0(cols, '_corrected')] <- map2(df[cols], df[paste0(cols, '_new')], `-`)

answered Sep 25 '22 23:09

Ronak Shah

Here is a base R solution. We can define a function, diff_col, that creates the operation and then use a for loop to repeatedly apply this function to modify the data frame.

diff_col <- function(col, dat){
  dat[[paste0(col, "_corrected")]] <- dat[[col]] - dat[[paste0(col, "_new")]]
  return(dat)
}

for (name in c("carb", "disp")){
  df <- diff_col(col = name, dat = df)
}

answered Sep 25 '22 23:09

www

Related questions
                            
                                Split and re-concatenate a string
                            
                                Retrieve Census tract from Coordinates [closed]
                            
                                dplyr lag with n from column values
                            
                                Center leaflet in a rmarkdown document
                            
                                Fixing the order of a Sankey flow graph in R / networkD3 package
                            
                                How to convert the result of xtabs() into dataframe in R? [duplicate]
                            
                                name character vectors with same name of list
                            
                                How to make in R matrix of intersections and unions over categories?
                            
                                How to split all strings in a column AND include prefix in all the new data
                            
                                Remove *all* duplicate rows, unless there's a "similar" row
                            
                                Create flag indicating if year variable is in the range of start:end variables in data.table
                            
                                Filter top n largest groups in data.frame
                            
                                Function to find if a value is greater than all prior values in a vector
                            
                                Compare two rows of a data.table and show only columns with differences [duplicate]
                            
                                Split string with repeated delimiters
                            
                                Create new variable by multiple conditions via mutate case_when
                            
                                R how to create columns/features based on existing data
                            
                                In R, how can I build a list with some elements referring to earlier elements in the list?
                            
                                Changing the conditions to replace elements in a vector
                            
                                Can we neatly align the regression equation and R2 and p value?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

dplyr: access column name in mutate_at function

Tags:

r

dplyr

Moritz Schwarz

People also ask

5 Answers

Abdessabour Mtk

TimTeaFan

akrun

Ronak Shah

www

Recent Activity

Donate For Us