
dplyr mutate/transmute: drop only the columns used in the formula

Tags: r, dplyr

Suppose my data frame has columns A, B, C, D, E.

I want to produce a data frame with columns A, B, C, X, where X = D * E.

Obviously I can use %>% mutate(X = D * E) %>% select(-D, -E), but for more elaborate situations, is there a way to do it in a single command? Something like transmute(), but discarding only the columns that were used in the formula.

Silly, but I keep wishing for this bit of conciseness.

David Kaufman asked Jul 19 '18 16:07



1 Answer

mutate() now has an experimental .keep argument that lets you do this in one operation:

df %>% mutate(X = D * E, .keep = "unused")
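As a minimal sketch of what that call does, here is a small made-up data frame with the question's column names (the values are illustrative assumptions, and this assumes a dplyr version that provides .keep, i.e. 1.0.0 or later). D and E are dropped because they appear in the expression for X; the untouched columns stay.

```r
library(dplyr)

# Hypothetical example data; only the column names come from the question.
df <- tibble(
  A = 1:3,
  B = 4:6,
  C = 7:9,
  D = c(2, 3, 4),
  E = c(10, 20, 30)
)

# .keep = "unused" retains only the columns that were NOT used
# to compute the new variables, plus the new variables themselves.
result <- df %>% mutate(X = D * E, .keep = "unused")

names(result)
# "A" "B" "C" "X"
```

This gives the same columns as mutate(X = D * E) %>% select(-D, -E), without having to list the consumed columns by hand.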

You can also control where the new variables are placed among the existing columns, using the .before and .after arguments. See https://rdrr.io/github/tidyverse/dplyr/man/mutate.html
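A short sketch of combining column placement with .keep, again on made-up data using the question's column names (an assumption for illustration). Here .after = A asks mutate() to put the new column right after A instead of at the end.

```r
library(dplyr)

# Hypothetical example data; only the column names come from the question.
df <- tibble(
  A = 1:3,
  B = 4:6,
  C = 7:9,
  D = c(2, 3, 4),
  E = c(10, 20, 30)
)

# Compute X, drop the columns it was computed from,
# and position X immediately after column A.
placed <- df %>% mutate(X = D * E, .keep = "unused", .after = A)

names(placed)
```

Anchoring the position to a column that survives the operation (A here, rather than D or E) keeps the intent easy to read.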

Rasmus Ø. Pedersen answered Oct 11 '22 14:10