I have this dataframe:
state county city region mmatrix X1 X2 X3 A1 A2 A3  B1    B2 B3 C1 C2 C3
1     1      1    1      111010  1  0  0  2  20 200 Push  8  12 NA NA NA
1     2      1    1      111010  1  0  0  4  NA 400 Shove 9  NA
Now I want to exclude columns whose names end with a certain string, say "1" (e.g. A1 and B1). I wrote this code:
df_redacted <- df[, -grep("\\1$", colnames(df))]
However, this seems to delete every column. How can I modify the code so that it only deletes the columns that match the pattern (i.e. end with "1" or any other given string)?
The solution has to be able to handle a dataframe that has both numerical and categorical values.
Method 1: Using subset(). One of the easiest ways to drop columns is the subset() function, with the '-' sign in its select argument indicating the variables to drop. subset() creates subsets of a data frame and can also be used to drop columns from it.
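As a minimal sketch of the subset() approach described above: the data frame below is a hypothetical cut-down version of the one in the question (the names df and df_small are only for illustration), mixing numeric and character columns.

```r
# Hypothetical toy data, loosely modelled on the question's data frame
df <- data.frame(
  state = c(1, 1),
  A1    = c(2, 4),
  A2    = c(20, NA),
  B1    = c("Push", "Shove"),
  stringsAsFactors = FALSE
)

# The minus sign inside select = drops the listed columns by name
df_small <- subset(df, select = -c(A1, B1))

print(names(df_small))  # "state" "A2"
```

Note that this form needs the column names spelled out, so on its own it does not match names by pattern.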
We can delete multiple columns in an R dataframe by assigning NULL to them through the list() function.
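A small sketch of the NULL-assignment idiom, again on a hypothetical toy data frame (column names chosen to echo the question):

```r
# Hypothetical toy data with numeric and character columns
df <- data.frame(
  state = c(1, 1),
  A1    = c(2, 4),
  A2    = c(20, NA),
  B1    = c("Push", "Shove"),
  stringsAsFactors = FALSE
)

# Assigning list(NULL) to a set of columns removes them in place
df[c("A1", "B1")] <- list(NULL)

print(names(df))  # "state" "A2"
```

Unlike subset(), this modifies df itself rather than returning a copy, so keep a backup if the original columns are still needed.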
To remove a character in an R data frame column, we can use the gsub function, which replaces the character with an empty string. For example, if we have a data frame called df that contains a character column, say x, whose values each contain the characters ID, then these can be removed by using the command gsub("ID","",as.
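The command in the text above is cut off, so the following is only one plausible way to apply gsub() to a column, not necessarily the one the original went on to give. The data frame and its values are made up for illustration. Note that this changes the values inside a column; it does not drop any columns.

```r
# Hypothetical data: every value in column x carries an "ID" prefix
df <- data.frame(
  x = c("ID101", "ID102", "ID103"),
  y = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Replace every occurrence of "ID" in x with the empty string
df$x <- gsub("ID", "", df$x)

print(df$x)  # "101" "102" "103"
```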
To remove all columns with a common suffix from an R data frame, you can use the grep() function. It identifies the columns that share the suffix and returns a vector of their positions. You can pass this vector, negated, to the select() function to remove the columns from the data frame.
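The same idea can be sketched in base R without select(). Two assumptions are worth flagging: the pattern is written as "1$" (in "\\1$" the \\1 is parsed as a backreference rather than the literal digit 1), and grepl() with logical negation is swapped in for -grep(), because negative indexing with an empty match vector selects zero columns. The data frame is a hypothetical stand-in for the one in the question.

```r
# Hypothetical toy data with numeric and character columns
df <- data.frame(
  state = c(1, 1),
  X1    = c(1, 1),
  A1    = c(2, 4),
  A2    = c(20, NA),
  B1    = c("Push", "Shove"),
  B2    = c(8, 9),
  stringsAsFactors = FALSE
)

# TRUE for every column whose name ends in "1"
is_match <- grepl("1$", names(df))

# Keep the columns that do NOT match; drop = FALSE keeps a data frame
# even if only one column survives
df_redacted <- df[, !is_match, drop = FALSE]
print(names(df_redacted))  # "state" "A2" "B2"

# A pattern that matches nothing leaves the data frame untouched
df_same <- df[, !grepl("Z$", names(df)), drop = FALSE]
print(ncol(df_same))  # 6
```

To match a different suffix, change the part before the `$` anchor, for example "3$" for names ending in 3.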
I found a simple answer using dplyr/tidyverse. If your colnames contain "This", then all variables containing "This" will be dropped.
library(dplyr)
df_new <- df %>% select(-contains("This"))