
dplyr pipes - How to change the original dataframe

When I don't use a pipe, I can change the original data frame using these commands:

df <- slice(df, -c(1:3))                  # delete top 3 rows
df <- select(df, -c(Col1, Col50, Col51))  # delete specific columns

How would one do this with a pipe? I tried this but the slice and select functions don't change the original dataframe.

df%>%
  slice(-c(1:3))%>% 
  select(-c(Col1,Col50,Col51))

I'd like to change the original df.

Silver.Rainbow asked Oct 25 '15 22:10



1 Answer

You can definitely do the assignment by using an idiom such as df <- df %>% ... or df %>% ... -> df. But you could also avoid redundancy (i.e., stating df twice) by using the magrittr compound assignment operator %<>% at the beginning of the pipe.
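As a minimal sketch of the two plain-assignment idioms mentioned above: the toy data frame and its column names are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's df
df <- data.frame(Col1 = 1:5, Col2 = letters[1:5], Col3 = 5:1)

# Idiom 1: reassign the pipeline result to the same name
df1 <- df
df1 <- df1 %>%
  slice(-(1:3)) %>%   # drop the top 3 rows
  select(-Col1)       # drop a column

# Idiom 2: right-hand assignment at the end of the pipeline
df2 <- df
df2 %>%
  slice(-(1:3)) %>%
  select(-Col1) -> df2

# Both idioms should leave the same result behind
identical(df1, df2)
```

Either form works because the assignment operators bind more loosely than %>%, so the whole pipeline is evaluated before the result is stored.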

From the magrittr vignette:

The compound assignment pipe operator %<>% can be used as the first pipe in a chain. The effect will be that the result of the pipeline is assigned to the left-hand side object, rather than returning the result as usual.

So with your code, we can do

library(magrittr)  ## came with your dplyr install
df %<>% slice(-(1:3)) %>% select(-c(Col1, Col50, Col51))

This pipes df into the expression and updates df as the result.
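To see the effect on a small example, here is a sketch with a made-up data frame (the data and the choice of dropped column are assumptions for illustration only):

```r
library(dplyr)
library(magrittr)

# Hypothetical toy data; 5 rows, 3 columns
df <- data.frame(Col1 = 1:5, Col2 = letters[1:5], Col3 = 5:1)

# %<>% as the first pipe: the result of the whole chain is assigned back to df
df %<>% slice(-(1:3)) %>% select(-Col1)

# df itself has changed: the top 3 rows and Col1 are gone
dim(df)
names(df)
```

Note that %<>% has to be attached with library(magrittr); loading dplyr alone re-exports %>% but not the compound assignment pipe.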

Update: In the comments you note an issue setting the column names. Fortunately, magrittr provides functions for setting attributes in a pipe. Try the following.

df %<>% 
    set_colnames(sprintf("Col%d", 1:ncol(.))) %>% 
    slice(-(1:3)) %>%
    select(-c(Col1,Col50,Col51))

Note that since we have a data frame, we can also use setNames() (stats) or set_names() (magrittr) in place of set_colnames().
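A short sketch of the three alternatives side by side, using a made-up two-column data frame (the data and the new_names helper variable are hypothetical):

```r
library(magrittr)

# Hypothetical toy data with arbitrary column names
df <- data.frame(a = 1:3, b = 4:6)
new_names <- sprintf("Col%d", seq_len(ncol(df)))

# Three ways to rename the columns inside a pipe
via_set_colnames <- df %>% set_colnames(new_names)  # magrittr alias for `colnames<-`
via_set_names    <- df %>% set_names(new_names)     # magrittr alias for `names<-`
via_setNames     <- df %>% setNames(new_names)      # stats::setNames

names(via_set_colnames)
names(via_set_names)
names(via_setNames)
```

For a data frame the column names and the names attribute are the same thing, which is why all three are interchangeable here. If another attached package also exports set_names(), calling magrittr::set_names() explicitly avoids ambiguity.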


Thanks to Steven Beaupre for adding the note from the vignette.

Rich Scriven answered Oct 18 '22 15:10