Mutate across multiple columns to create new variable sets

Question

I have a panel dataset at the country and year level, and I'd like to create a two new variables based on existing ones.

year	country	var1	var2	var3	var 4	mean_var1	relmean_var1
1910	GER	1	4	10	6	3	0.333
1911	GER	2	3	11	7	1.5	1.3333
1910	FRA	5	6	8	9	3	1.66667
1911	FRA	1	4	10	9	1.5	.66667

What I'd like to do is create two new variables set : (1) a variable set of the average for each year (across countries) and (2) a variable set of the country value relative to the year-average. For example, for var1(1) would yield mean_var1 and (2) relmean_var1 and I'd want these for all the other variables. In total, there are over 1000 variables in the dataset, but I would only apply this function to about 6.

I have code that works for the first part, but I'd like to combine it as efficiently as possible with the second.

library(dplyr)
library(purrr)
df<- df%>% 
            group_by(year) %>%
            mutate_at(.funs = list(mean = ~mean(.)), .vars = c("var1", "var1", "var1", "var4"))

This code yields new variables called var1_mean (I would prefer mean_var1: how do I change this name?)

For the second step, I've tried:

df <- df %>%
map2_dfr(.x = d.test %>%
            select(var1, var2),
          .y = d.test %>%
            select(var1_mean, var2_mean), 
          ~ .x / .y) %>%
   setNames(c("relmean_var1", "relmean_var2"))

and I get errors

""Error in select(., var1, var2) : object 'd.test' not found."

. (I got this set up from this question)

I also tried:

 map2(var1, var1_mean, ~ df[[.x]] / df[[.y]]) %>% 
   set_names(cols) %>% 
   bind_cols(df, .)

And got

"Error in map2(var1, var1_mean, ~df[[.x]]/df[[.y]]) : object 'var1' not found

What's the best way to combine these two goals? Ideally with the naming scheme mean_var1 for (1) and relmean_var1 for (2)

Edit: input dataframe should look like this:

data <- tibble::tribble(
  ~year, ~country, ~var1, ~var2, ~var3, ~var.4,
  1910L,    "GER",    1L,    4L,   10L,     6L,
  1911L,    "GER",    2L,    3L,   11L,     7L,
  1910L,    "FRA",    5L,    6L,    8L,     9L,
  1911L,    "FRA",    1L,    4L,   10L,     9L
)

output dataframe should look like this (for all variables, just showing var1 as an example, but should be the same format for var2 through var4):

datanew  <- tibble::tribble(
  ~year, ~country, ~var1, ~var2, ~var3, ~var.4, ~mean_var1 , ~relmean_var1
  1910L,    "GER",    1L,    4L,   10L,     6L,     3L,        .3333L,
  1911L,    "GER",    2L,    3L,   11L,     7L,     1.5L,     1.3333L,
  1910L,    "FRA",    5L,    6L,    8L,     9L,     3L,       1.6667L,
  1911L,    "FRA",    1L,    4L,   10L,     9L      1.5L,      .6667L,
)

Ben · Accepted Answer

This might be easier in long format, but here's an option you can pursue as wide data.

Using the latest version of dplyr you can mutate across and include .names argument to define how your want your new columns to look.

library(tidyverse)

my_col <- c("var1", "var2", "var3", "var4")

df %>%
  group_by(year) %>%
  mutate(across(my_col, mean, .names = "mean_{col}")) %>%
  mutate(across(my_col, .names = "relmean_{col}") / across(paste0("mean_", my_col)))

Output

   year country  var1  var2  var3  var4 mean_var1 mean_var2 mean_var3 mean_var4 relmean_var1 relmean_var2 relmean_var3 relmean_var4
  <int> <chr>   <int> <int> <int> <int>     <dbl>     <dbl>     <dbl>     <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
1  1910 GER         1     4    10     6       3         5         9         7.5        0.333        0.8          1.11         0.8  
2  1911 GER         2     3    11     7       1.5       3.5      10.5       8          1.33         0.857        1.05         0.875
3  1910 FRA         5     6     8     9       3         5         9         7.5        1.67         1.2          0.889        1.2  
4  1911 FRA         1     4    10     9       1.5       3.5      10.5       8          0.667        1.14         0.952        1.12

danlooo · Answer

library(tidyverse)

data <- tibble::tribble(
  ~year, ~country, ~var1, ~var2, ~var3, ~var.4,
  1910L,    "GER",    1L,    2L,   10L,     6L,
  1911L,    "GER",    2L,    3L,   11L,     7L,
  1910L,    "FRA",    5L,    6L,    8L,     9L,
  1911L,    "FRA",    1L,    3L,   10L,     9L
)

data_long <-
  data %>%
  pivot_longer(-c(year, country))

data_long
#> # A tibble: 16 x 4
#>     year country name  value
#>    <int> <chr>   <chr> <int>
#>  1  1910 GER     var1      1
#>  2  1910 GER     var2      2
#>  3  1910 GER     var3     10
#>  4  1910 GER     var.4     6
#>  5  1911 GER     var1      2
#>  6  1911 GER     var2      3
#>  7  1911 GER     var3     11
#>  8  1911 GER     var.4     7
#>  9  1910 FRA     var1      5
#> 10  1910 FRA     var2      6
#> 11  1910 FRA     var3      8
#> 12  1910 FRA     var.4     9
#> 13  1911 FRA     var1      1
#> 14  1911 FRA     var2      3
#> 15  1911 FRA     var3     10
#> 16  1911 FRA     var.4     9

means_country <-
  data_long %>%
  group_by(country) %>%
  summarise(mean_country_value = mean(value))

means_years <-
  data_long %>%
  group_by(year) %>%
  summarise(mean_year_value = mean(value))

data %>%
  left_join(means_country) %>%
  left_join(means_years)
#> Joining, by = "country"
#> Joining, by = "year"
#> # A tibble: 4 x 8
#>    year country  var1  var2  var3 var.4 mean_country_value mean_year_value
#>   <int> <chr>   <int> <int> <int> <int>              <dbl>           <dbl>
#> 1  1910 GER         1     2    10     6               5.25            5.88
#> 2  1911 GER         2     3    11     7               5.25            5.75
#> 3  1910 FRA         5     6     8     9               6.38            5.88
#> 4  1911 FRA         1     3    10     9               6.38            5.75

^{Created on 2021-11-24 by the reprex package (v2.0.1)}

Mutate across multiple columns to create new variable sets

Tags:

r

nested-loops

dplyr

purrr

mutation

PierreRoubaix

2 Answers

Ben

danlooo

Recent Activity

Donate For Us

Mutate across multiple columns to create new variable sets

Tags:

r

nested-loops

dplyr

purrr

mutation

PierreRoubaix

2 Answers

Ben

danlooo

Related questions

Recent Activity

Donate For Us