In dplyr, I want to exclude columns whose names contain the word "junk", but there may not be any such column. In that case, dplyr should return all columns. Instead, it returns none. See the test case below.
library(dplyr)
df<-data.frame(name=paste("name",1:5), age=1:5)
str(df)
# 'data.frame': 5 obs. of 2 variables:
# $ name: Factor w/ 5 levels "name 1","name 2",..: 1 2 3 4 5
# $ age : int 1 2 3 4 5
df1<-df%>%select(-contains("junk"))
str(df1)
# 'data.frame': 5 obs. of 0 variables
Where am I going wrong?
To drop a column with dplyr, put a minus sign in front of the column name inside select(). For example, to remove the columns "X" and "Y": select(Your_Dataframe, -c(X, Y)).
The select() function of the dplyr package picks variables (columns) from a data frame. Columns can be given by name or by index (position).
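As a minimal sketch of the two forms described above (dropping by name and dropping by position), assuming dplyr is installed. The data frame `scores` and its columns are hypothetical, made up purely for illustration:

```r
library(dplyr)

# Hypothetical data frame, used only for illustration
scores <- data.frame(id = 1:3, X = c(0.1, 0.2, 0.3), Y = c("a", "b", "c"), Z = 4:6)

# Drop columns by name: the minus sign goes inside select()
by_name <- select(scores, -c(X, Y))
names(by_name)
# "id" "Z"

# Drop the same columns by position (X is column 2, Y is column 3)
by_position <- select(scores, -c(2, 3))
names(by_position)
# "id" "Z"
```

Dropping by name is usually the safer choice, since positions shift whenever columns are added or reordered.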
It works if you put everything() before the -contains() inside select:
library(dplyr) # 0.4.1
df %>% select(everything(), -contains("junk"))
# name age
#1 name 1 1
#2 name 2 2
#3 name 3 3
#4 name 4 4
#5 name 5 5
However, I agree that it would be more intuitive if it worked without the need for everything().
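If you would rather not depend on how select() treats a helper that matches nothing, a base R fallback is one option. This is only a sketch, not part of the answer above: it builds a logical mask over the column names with grepl() and keeps every column that does not match. The names `keep`, `df_clean`, `df_junk`, and `junk_flag` are made up for illustration. `ignore.case = TRUE` is used because contains() ignores case by default.

```r
df <- data.frame(name = paste("name", 1:5), age = 1:5)

# TRUE for every column whose name does NOT contain "junk".
# When nothing matches, grepl() returns all FALSE, so the mask is all TRUE.
keep <- !grepl("junk", names(df), ignore.case = TRUE)

# drop = FALSE keeps the result a data frame even if one column remains
df_clean <- df[, keep, drop = FALSE]
names(df_clean)
# "name" "age"

# The same mask logic removes the column when a match does exist
df_junk <- cbind(df, junk_flag = 0)
names(df_junk[, !grepl("junk", names(df_junk), ignore.case = TRUE), drop = FALSE])
# "name" "age"
```

Note that grepl() treats the pattern as a regular expression, whereas contains() matches a literal string. For a plain word like "junk" the two behave the same.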