I have a large data set with thousands of columns. The column names include various unwanted characters as follows:
col1_3x_xxx
col2_3y_xyz
col3_3z_zyx
I would like to remove everything from "_3" onward in every column name, leaving clean names:
col1
col2
col3
What is the most efficient way to do this for 5000+ columns?
To remove a character from an R data frame column, we can use the gsub() function, which replaces the character with an empty string. For example, if a data frame called df has a character column x whose values each contain the string ID, it can be removed with the command gsub("ID","",as.
How do you remove one or more characters from a string in R? You can use either the base R function gsub() or str_replace() from the stringr package.
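As a small illustration of the base R option, the sketch below contrasts sub(), which replaces only the first match in each string, with gsub(), which replaces every match. The vector x and the choice of "_" as the character to remove are illustrative assumptions, not part of the original question.

```r
# Illustrative input (assumed): two of the column names from the question
x <- c("col1_3x_xxx", "col2_3y_xyz")

# fixed = TRUE treats "_" as a literal string rather than a regular expression
first_only  <- sub("_", "", x, fixed = TRUE)   # removes the first "_" only
all_matches <- gsub("_", "", x, fixed = TRUE)  # removes every "_"

print(first_only)   # "col13x_xxx" "col23y_xyz"
print(all_matches)  # "col13xxxx"  "col23yxyz"
```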
To remove a string's first character, we can use the built-in substring() function in R. It accepts three arguments: the string, the start position, and the end position (the end position is optional and defaults to the end of the string).
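A minimal sketch of that idea follows. The input vector x is a made-up example; both calls should give the same result, the second simply spelling out the end position.

```r
# Illustrative input (assumed): strings with one unwanted leading character
x <- c("_col1", "_col2")

# Start at position 2; the end position defaults to the end of each string
without_first <- substring(x, 2)

# Equivalent call with the end position given explicitly
explicit_end <- substring(x, 2, nchar(x))

print(without_first)  # "col1" "col2"
```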
Use str_replace_all() from the stringr package to replace several different substrings in a single column in one call, each with its own replacement string.
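The following is a hedged sketch of that approach, assuming the stringr package is installed (the code skips itself otherwise). The strings and the named replacement vector are invented for illustration; each name is a pattern and each value is its replacement.

```r
# Runs only if stringr is available; the data here is purely illustrative
if (requireNamespace("stringr", quietly = TRUE)) {
  x <- c("one two", "two three")

  # Named vector: names are patterns, values are their replacements
  replacements <- c("one" = "1", "two" = "2")

  replaced <- stringr::str_replace_all(x, replacements)
  print(replaced)  # "1 2" "2 three"
}
```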
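We can use sub() on the column names. The pattern "_3.*" matches "_3" and everything after it, so replacing the match with an empty string leaves only the prefix:
sub("_3.*", "", colnames(df1))
#[1] "col1" "col2" "col3"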
Certainly late with this answer, but just in case someone is looking for a solution, this renames the single column at index col:
colnames(df1)[col] <- sub("_3.*", "", colnames(df1)[col])
And if you have multiple columns:
# seq_len() is safe even when df1 has zero columns, unlike 1:ncol(df1)
for (col in seq_len(ncol(df1))) {
  colnames(df1)[col] <- sub("_3.*", "", colnames(df1)[col])
}
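Since sub() is vectorized over its input, the loop can probably be replaced by a single assignment, which is likely the simplest option for 5000+ columns. The sketch below builds a small stand-in data frame (df1 and its contents are assumptions for illustration) and renames every column at once.

```r
# Stand-in for the real data set (assumed): three columns named as in the question
df1 <- data.frame(col1_3x_xxx = 1:2, col2_3y_xyz = 3:4, col3_3z_zyx = 5:6)

# Strip "_3" and everything after it from all column names in one step
names(df1) <- sub("_3.*", "", names(df1))

print(names(df1))  # "col1" "col2" "col3"
```

One thing to watch for: if two original names share the same prefix before "_3", the cleaned names will collide. A check such as anyDuplicated(names(df1)) after renaming would reveal that.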