
How to subtract multiple .y columns from .x columns with the same prefix

I have this tibble:

# A tibble: 2 x 8
    a.x   b.x   c.x   d.x   a.y   b.y   c.y   d.y
  <int> <int> <int> <int> <int> <int> <int> <int>
1    13    13    12    11     7     1     4     2
2    17    11     0     0    16     2     0     0
df <- structure(list(a.x = c(13L, 17L), b.x = c(13L, 11L), c.x = c(12L, 
0L), d.x = c(11L, 0L), a.y = c(7L, 16L), b.y = 1:2, c.y = c(4L, 
0L), d.y = c(2L, 0L)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to calculate a.x - a.y, b.x - b.y, c.x - c.y, and so on.

My desired output:

    a.x   b.x   c.x   d.x   a.y   b.y   c.y   d.y     a     b     c     d
  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1    13    13    12    11     7     1     4     2     6    12     8     9
2    17    11     0     0    16     2     0     0     1     9     0     0

I can achieve this by:

df %>% 
    mutate(a = a.x-a.y,
           b = b.x-b.y,
           c = c.x-c.y,
           d = d.x-d.y)

I want to learn:

  1. How to extract the prefixes and use them as the new column names.
  2. How to automate the calculation .x - .y.

TarJae asked Jul 26 '21 20:07


1 Answer

One option uses across() with cur_column(): loop over the columns whose names end in .x, build the name of the matching .y column by replacing the suffix in cur_column(), fetch that column with get(), and subtract it. The .names argument strips the .x suffix so that each result is named after its prefix.

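library(dplyr)
library(stringr)
df %>% 
   mutate(across(ends_with('.x'),
     # subtract the matching .y column, looked up by name
     ~ . - get(str_replace(cur_column(), '\\.x$', '.y')), 
         # name each new column after its prefix, e.g. a.x -> a
         .names = "{str_remove(.col, fixed('.x'))}"))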

Output:

# A tibble: 2 x 12
    a.x   b.x   c.x   d.x   a.y   b.y   c.y   d.y     a     b     c     d
  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1    13    13    12    11     7     1     4     2     6    12     8     9
2    17    11     0     0    16     2     0     0     1     9     0     0
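
The name handling in the across() call can be checked on its own. The following is a minimal sketch that uses base R's grep() and sub() in place of the stringr helpers. The variable names cols, x_cols, y_cols, and prefixes are illustrative only and are not part of the answer.

```r
# Column names as in the question's tibble
cols <- c("a.x", "b.x", "c.x", "d.x", "a.y", "b.y", "c.y", "d.y")

# Columns ending in ".x" (the set that ends_with('.x') selects)
x_cols <- grep("\\.x$", cols, value = TRUE)

# Name of the matching ".y" partner for each ".x" column
y_cols <- sub("\\.x$", ".y", x_cols)

# Prefix that becomes the new column name
prefixes <- sub("\\.x$", "", x_cols)

print(x_cols)
print(y_cols)
print(prefixes)
```

Anchoring the pattern at the end of the name (\\.x$) leaves a prefix that itself contains an x, such as a hypothetical max.x, intact.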

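Alternatively, reshape with pivot_longer() so that each prefix becomes a single column holding its .x and .y values, take the difference within each original row, and bind the result back onto the original data: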

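library(tidyr)
df %>%
     mutate(rn = row_number()) %>%        # row id to group by later
     pivot_longer(cols = -rn, names_to = c(".value"), 
          names_pattern = "(.*)\\..*") %>%    # a.x and a.y both feed column a
     group_by(rn) %>% 
     # diff() gives y - x for each prefix; negate it to get x - y
     summarise(across(everything(), ~ -diff(.))) %>%
     select(-rn) %>%
     bind_cols(df, .)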

Output:

# A tibble: 2 x 12
    a.x   b.x   c.x   d.x   a.y   b.y   c.y   d.y     a     b     c     d
  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1    13    13    12    11     7     1     4     2     6    12     8     9
2    17    11     0     0    16     2     0     0     1     9     0     0
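
The sign flip in -diff(.) is easy to miss. As a small sketch, take the two values that the reshaped column a holds for the first row of the question's data (a.x = 13, then a.y = 7). The names vals, d, and res are illustrative.

```r
# After pivot_longer, each row id has two values per prefix:
# the .x value first, then the .y value (row 1, column a: 13 and 7).
vals <- c(13L, 7L)

# diff() returns later minus earlier, i.e. y - x
d <- diff(vals)

# Negating gives x - y, the quantity asked for
res <- -d
print(res)
```

This relies on the .x columns appearing before the .y columns in the data; with the opposite column order, the negation would not be needed.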

akrun answered Nov 02 '22 23:11