
Mutate_all except some columns

I have a data frame containing a set of variables that I want to lag by different lengths so that I can use them in regressions later on (instead of lagging one variable at a time manually).

I found this code on Stack Overflow that seems to do the trick:

df = data.frame(a = 1:10, b = 21:30)
# dplyr::lag is spelled out: a bare `lag` resolves to stats::lag unless dplyr is attached
dplyr::mutate_all(df, dplyr::lag)
    a  b
1  NA NA
2   1 21
3   2 22
4   3 23
5   4 24
6   5 25
7   6 26
8   7 27
9   8 28
10  9 29

The problem is that this lags every column, and I have some columns that I don't want lagged. How do I adapt the code above so that those columns are excluded? Also, how do I lag by different lengths? Right now it only lags by 1, the default.

asked Apr 20 '20 11:04 by Andycode


1 Answer

I keep googling my way back to this same Q&A and then noting that mutate_at() and mutate_if() are now superseded by across(), which provides a slightly easier-to-remember approach to the "mutate all except these columns" pattern:

> library(dplyr)  # provides mutate(), across(), lag() and %>%
> df = data.frame(a = 1:10, b = 21:30, c = 31:40, d = 41:50)
> df
    a  b  c  d
1   1 21 31 41
2   2 22 32 42
3   3 23 33 43
4   4 24 34 44
5   5 25 35 45
6   6 26 36 46
7   7 27 37 47
8   8 28 38 48
9   9 29 39 49
10 10 30 40 50
> # everything but columns b and c
> df %>% mutate(across(!b & !c, lag))
    a  b  c  d
1  NA 21 31 NA
2   1 22 32 41
3   2 23 33 42
4   3 24 34 43
5   4 25 35 44
6   5 26 36 45
7   6 27 37 46
8   7 28 38 47
9   8 29 39 48
10  9 30 40 49
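
The example above lags by 1, the default. For the "different lengths" part of the question, one possible sketch is to pass n to lag() through an anonymous function inside across(). The lag lengths (1, 2, 3) and the object names lag2 and lagged are illustrative assumptions, not part of the original question, and the selection !c(b, c) is assumed to be equivalent to !b & !c.

```r
library(dplyr)

df <- data.frame(a = 1:10, b = 21:30, c = 31:40, d = 41:50)

# Lag every column except b and c by 2 rows; the lambda forwards
# the extra argument n to dplyr::lag().
lag2 <- df %>%
  mutate(across(!c(b, c), ~ dplyr::lag(.x, n = 2)))

# Keep the originals and add lag-1 and lag-3 copies as new columns.
# .names builds the new column names: a_lag1, a_lag3, d_lag1, d_lag3.
lagged <- df %>%
  mutate(across(
    !c(b, c),
    list(lag1 = ~ dplyr::lag(.x, n = 1),
         lag3 = ~ dplyr::lag(.x, n = 3)),
    .names = "{.col}_{.fn}"
  ))
```

Keeping the lagged values in separately named columns can be convenient for regressions, since the original and lagged variables then sit side by side in one data frame.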
answered Oct 31 '22 00:10 by mac